flashblade

Enabling Agentic AI via Pure1 Manage MCP Server
Everpure now offers a Pure1® Manage MCP Server so you can query information about your fleet using natural language questions. In this post, I’ll explain how the Pure1 Manage MCP Server works. The first section will explain MCP in general, and the second section will explain how to use our specific server. Feel free to skip to the Quick start section if you’re already familiar with MCP and just need the parameters to plug into your host.

What is MCP?

MCP stands for "Model Context Protocol," and it's a way for users to connect their AI applications to external systems using tool calls. MCP tools are fundamentally rooted in application programming interfaces (APIs). An API is a set of rules and protocols that allows different software applications to communicate with each other. It acts as an intermediary, enabling one piece of software (the client) to request information or functionality from another piece of software (the server) without needing to know the server's internal workings. For instance, when you check the weather on your phone, the weather app uses an API to send a request to a weather service, which then returns the current weather data.

AI applications have trouble making API calls directly because APIs are designed for completeness and correctness, not for an LLM to use easily. When an AI application wants to use an external system to handle a user’s request, it uses MCP to make a tool call. The AI (client) requests a function (the tool) from an external system (the server), and the system executes the function and returns a result. This makes MCP a system that standardizes and mediates API-like interactions, allowing AI models to leverage external, real-world capabilities.

For more information, see this article on the MCP website: “What is the Model Context Protocol (MCP)?”

How can customers benefit from the Pure1 Manage MCP Server?
The Pure1 Manage MCP Server enables customers to securely integrate AI assistants, copilots, and agentic systems with live Pure1 telemetry and operational data—without building custom API integrations. It transforms Pure1 from a dashboard-centric experience into an AI-accessible platform, enabling natural language interaction, contextual automation, and real-time operational intelligence. Customers benefit from faster AI integration, reduced engineering effort, preserved security controls, and improved decision velocity across hybrid environments.

What types of customer workflows are best suited for MCP?

The Pure1 Manage MCP Server is particularly well-suited for agentic and AI-driven workflows, including:

Fleet telemetry integration with customer copilots
Expose Pure1 telemetry—arrays, volumes, workloads, metrics, and alerts—into internal copilots, chatbots, or AI platforms via MCP endpoints.
Value: Unified operational visibility across hybrid and multi-platform environments

Automation with context awareness
Use MCP to validate storage state, health, performance, or capacity before executing provisioning, backup, or disaster recovery workflows.
Value: Safer automation with contextual validation, reduced execution errors, and greater rollback confidence

Hybrid cloud observability
Correlate Everpure array performance and capacity metrics with application, VM, container, or cloud telemetry across environments.
Value: Faster troubleshooting and improved end-to-end performance insights

Conversational operational analytics
Enable operators to ask real-time natural language questions, such as:
“Which arrays are nearing capacity risk?”
“Show me the top latency spikes in the last 24 hours.”
“Summarize all critical alerts across regions.”
Value: Rapid insight without navigating dashboards or exporting reports

What the Pure1 Manage MCP Server can do

The Pure1 Manage MCP Server exposes the Pure1 Manage REST APIs to AI applications.
This means you can ask ad hoc questions about your fleet using natural language, and your AI application will use the server’s tools to query the answer. No programming required.

Example conversation

Here’s an example conversation demonstrating the use of the Pure1 Manage MCP Server:

User: "What is the name of my oldest FlashArray and what is its serial number?"
Chatbot: Processing Request: The AI recognizes this as a request for fleet information requiring a tool call.
Tool Invocation: Get_hardware
Response: The oldest FlashArray in your fleet is named FA-Prod-01 with serial number FA10123456.

User: "Can you show me the average I/O latency for that array over the last 7 days?"
Chatbot: Let me first check the available latency metrics, then query the historical data:
Tool Invocation: Get_metrics
Tool Invocation: Get_metrics_history
Response: The average I/O latency for FA-Prod-01 (serial FA10123456) over the last 7 days was 0.385 milliseconds.

User: "How many arrays do I have in total that are still under warranty?"
Chatbot: Processing Request: The AI recognizes this as a request for a count of arrays filtered by warranty status.
Tool Invocation: Get_array_support_contracts
Response: You currently have 12 arrays in your fleet that are still under active warranty.

Quick start

Step 1: Register an API key in Pure1 Manage

The Pure1 Manage MCP Server leverages the Pure1 Manage REST APIs. To access those APIs, you need to register an API key in Pure1 Manage. To do that, follow the directions in the Pure1® REST API introductory blog post. After going through the instructions, you will have an application ID and a private key file, which are used to generate the access token for the MCP server in step 2.

Step 2: Set up the pure1_token_factory.py script

Prerequisites: you need Python 3.12 or greater to run the script.

Download pure1_token_factory.zip.
Unzip the archive.
Go to the unzipped folder in your command-line terminal.
Optional but recommended: create and activate a Python virtual environment:
    python3 -m venv .venv
    source .venv/bin/activate
Install the requirements:
    pip3 install -r requirements.txt
Run the script:
    python3 pure1_token_factory.py <application_id> <private_key_file>
Copy the generated access token from the script output for the next step.

Step 3: Add remote MCP server to your AI application

Follow the directions for your AI application to add a remote MCP server (see the Pure1 Manage MCP Server User Guide for instructions for specific chatbots). In general, they need the following information:

Remote MCP Server address: https://api.pure1.purestorage.com/mcp
Authorization type: header
Header name: Authorization
Header value: Bearer <access-token>

Important: <access-token> is just a placeholder for the access token you generated in step 2. The actual header value should look something like “Bearer eyJ0eXAiO…”

Important: you need to generate a new access token every 10 hours. Run pure1_token_factory.py again each time, and manually copy the new access token into your AI application’s config.

Claude Desktop instructions

Claude Desktop is a special case because it doesn’t let you set the Authorization header directly. You have to run the mcp-remote local MCP server and configure that to use the Pure1 Manage remote MCP server.

Prerequisites

You need to have Node.js version 18 or newer installed on your system.

Configuration

In Claude Desktop, go to Settings > Developer, and click Edit Config.
Open the claude_desktop_config.json file in a plain-text editor like VS Code.
Configure the mcp-remote server, which is necessary to pass the Authorization header to the Pure1 Manage MCP Server. Paste the token into the configuration file, then restart Claude Desktop.
    {
      "mcpServers": {
        "Pure1 API": {
          "command": "npx",
          "args": [
            "-y",
            "mcp-remote",
            "https://api.pure1.purestorage.com/mcp",
            "--header",
            "Authorization:${AUTHORIZATION_HEADER}"
          ],
          "env": {
            "AUTHORIZATION_HEADER": " Bearer <paste access token here>"
          }
        }
      }
    }

Note: there might be other configuration options in this file. Be sure to leave them unchanged, and only insert the Pure1 API config in the mcpServers section.

The space in the AUTHORIZATION_HEADER environment variable is important. It's there to work around a bug in Windows argument parsing.

Please note that:
The first time Claude Desktop uses a tool, it will ask you for permission.
You can grant permission to all tools at once by going to Customize > Connectors > Pure1 API, and selecting Always Allow under Other tools.

For more detailed instructions from Anthropic, please refer to: Connect to local MCP servers - Model Context Protocol.
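Because the token has to be refreshed every 10 hours and the rest of claude_desktop_config.json must stay untouched, it can help to script the edit. The following is a minimal sketch, not part of the Pure1 tooling: the helper name add_pure1_server, the sample "other-server" entry, the "theme" key, and the EXAMPLE_TOKEN value are all hypothetical and only illustrate the merge.

```python
import copy
import json

PURE1_MCP_URL = "https://api.pure1.purestorage.com/mcp"


def add_pure1_server(config: dict, access_token: str) -> dict:
    """Return a copy of a Claude Desktop config with the Pure1 API entry set.

    Every other key in the config, including other MCP servers, is left as-is.
    """
    updated = copy.deepcopy(config)
    servers = updated.setdefault("mcpServers", {})
    servers["Pure1 API"] = {
        "command": "npx",
        "args": [
            "-y",
            "mcp-remote",
            PURE1_MCP_URL,
            "--header",
            "Authorization:${AUTHORIZATION_HEADER}",
        ],
        # The leading space before "Bearer" is deliberate (see the note above).
        "env": {"AUTHORIZATION_HEADER": f" Bearer {access_token}"},
    }
    return updated


# Hypothetical existing config with an unrelated server and setting.
existing = {
    "mcpServers": {"other-server": {"command": "node", "args": ["server.js"]}},
    "theme": "dark",
}
merged = add_pure1_server(existing, "EXAMPLE_TOKEN")
print(json.dumps(merged, indent=2))
```

In practice you would load the real file with json.load, pass in the token printed by pure1_token_factory.py, write the result back with json.dump, and restart Claude Desktop.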
rcagle
8 days ago Place User Blogs
Why Object Storage Still Matters
In Part 2, I wrote a line that, at the time, felt almost like a side comment — something I typed without fully appreciating how much it would change the direction of the story: “BREAKING NEWS: The FlashArray now supports Object??? What in the world? I may need to write an article about that!!” That reaction wasn’t planned, and it definitely wasn’t me being clever. It was me looking at the GUI and thinking, “that can’t be right… can it?” It didn’t line up with how I’ve been modeling storage architectures in my head for years, which usually means one of two things: either something fundamentally changed… or I’ve been confidently wrong about part of this for a while. And if I’m being completely honest, there was also a second reaction happening in parallel — one that I didn’t write down at the time because it sounded slightly ridiculous even in my own head: “Wait… do I actually understand why object storage exists in the first place? And more importantly… what exactly was wrong with files?” That’s the part nobody likes to admit out loud. We’ve all spent years confidently explaining block, file, and object as if we were born with that knowledge, when in reality most of us learned it incrementally, retroactively, and with just enough conviction to sound credible in front of a customer. Object storage, in particular, has always carried this aura of inevitability — like of course it’s better, of course it scales, of course it’s what modern applications need — without always forcing us to question why the previous model stopped being enough. Because for as long as most of us have been designing infrastructure, object storage has not simply been another protocol layered onto an existing system. It has represented a fundamentally different way of organizing and accessing data, one that required its own architectural approach, its own scaling model, and, more often than not, its own dedicated platform. 
The separation between block, file, and object was not arbitrary; it was a reflection of how deeply different those paradigms were in terms of metadata handling, access patterns, and performance expectations.

This is precisely why platforms such as Everpure FlashBlade exist in the first place. They were not created as extensions of traditional storage systems but as purpose-built architectures designed to treat unstructured data — and particularly object data — as a first-class citizen. The use of distributed metadata services, sharded across independent nodes, combined with a key-value storage model, allows such systems to achieve levels of parallelism and throughput that simply cannot be replicated within a controller-based design. In that context, object storage is not something that is “added” to the system; it is the system.

Which is why seeing S3 support appear on FlashArray required a pause. Not excitement. Not skepticism alone. Something closer to intellectual friction.

Reconciling Two Architectural Worlds

The most important step in understanding what FlashArray has introduced is to resist the temptation to treat it as a direct comparison to FlashBlade. These aren’t two different ways of solving the same problem. They’re two different answers to two different problems—and pretending otherwise is where people get themselves into trouble.

FlashBlade is built for object, not adapted to it. S3 talks directly to a distributed engine that thinks in objects, not files pretending to be objects. Metadata is spread across blades instead of becoming a centralized choke point, and the whole system scales the way modern workloads actually need it to. There’s no file system layer to fight with, no directory structure to navigate, no POSIX semantics getting in the way. It just does what you’d expect when you remove all of that: it goes fast, it scales cleanly, and it keeps up with workloads like HPC, AI and analytics without breaking a sweat.
FlashArray takes a very different path, and in reality, it’s not what most people expect. It doesn’t try to reinvent itself as an object platform, and it doesn’t throw an S3 gateway in front of the array and call it a day. With Purity 6.10.5+, S3 just shows up as another protocol the system understands, right next to block and file. That distinction matters more than it seems. This isn’t something duct-taped on the side — it’s part of the same control plane, the same data path, the same system you’ve already been running.

But let’s not pretend it turned into FlashBlade overnight. This is still a controller-driven architecture. The primary controller does the heavy lifting — handling requests, authenticating them, coordinating operations — before anything actually hits the storage engine. Which means it behaves differently, especially as workloads scale.

So it ends up in this interesting middle ground. Not a native object system in the pure sense, but not a hack either. Just a different way of exposing what’s already there.

The Translation Layer and Its Consequences

It would be irresponsible to discuss FlashArray S3 without explicitly addressing the implications of this design. Even with its native integration into Purity, S3 operations are still subject to the realities of a controller-bound architecture. Every request must be processed, authenticated, and coordinated before it is executed, introducing a measurable difference in behavior compared to both native block operations and distributed object systems.

The most immediate effect is latency. While FlashArray continues to deliver sub-150 microsecond performance for block workloads, S3 operations typically run at higher latencies (in the 1 millisecond range) due to the additional processing steps involved. This is not a flaw; it is the natural outcome of introducing a protocol that was designed for scale and flexibility into a system optimized for low-latency transactional workloads.
Metadata handling further reinforces this distinction. FlashBlade distributes metadata across its architecture, enabling massive parallelism and consistent performance at scale. FlashArray processes metadata through its controller framework, which introduces natural serialization points under high concurrency. As workloads become increasingly metadata-heavy — particularly with small objects — this difference becomes more pronounced.

The system also enforces clearly defined operational limits to maintain predictable performance. As of Purity 6.10.5+, FlashArray supports up to 250 S3 buckets per array and a maximum of 1,000,000 objects per bucket.

FlashArray Object Store Limits

Object storage operates at the array scope and does not integrate with multi-tenancy or “realms”, which has implications for service provider models and strict tenant isolation requirements. These constraints are not arbitrary limitations; they are guardrails that ensure the system behaves consistently within its architectural boundaries.

Where the Architecture Becomes Secondary

Having established those boundaries, the conversation naturally shifts from “how it works” to “why it matters”. In many enterprise environments, particularly within SLED organizations, the challenge is not achieving exabyte-scale throughput or supporting billions of objects. The challenge is delivering capabilities in a way that is operationally sustainable, economically efficient, and aligned with existing infrastructure.

This is where FlashArray’s approach becomes compelling. By exposing object storage within the same platform that already supports block and file workloads, it eliminates the need to introduce a separate system, a separate operational model, and a separate set of dependencies. The same management interface, the same automation framework, and the same data services extend across all protocols.

More importantly, object data inherits the full set of Purity capabilities.
Global inline deduplication and compression apply to S3 workloads, significantly improving storage efficiency compared to many object-native platforms. SafeMode snapshots extend immutability to object storage, providing a critical layer of protection against ransomware. ActiveCluster, combined with ActiveDR, enables a three-site resilience model that ensures data availability across multiple locations with zero RPO between primary sites.

These are not incremental improvements. They represent a shift in how object storage can be consumed within an enterprise.

Practical Use Cases in a Unified Model

When viewed through this lens, the use cases for FlashArray S3 become both clear and grounded in reality.

Development and Staging Environments

For applications that rely on S3 APIs but do not require massive scale, FlashArray provides a consistent and integrated object interface without introducing additional infrastructure. Developers can build and test against a familiar model while remaining within the same operational environment.

Backup and Recovery Workflows

FlashArray S3 enables modern data protection strategies that leverage object storage while benefiting from flash performance, deduplication, and indelible snapshots. This combination improves both recovery times and storage efficiency.

Tier-two repositories and application-integrated storage represent another natural fit. Workloads such as document management systems, logs, and archival data often require object semantics but do not justify the higher cost of a dedicated object platform. Consolidating these workloads onto FlashArray simplifies operations while maintaining reliability and performance.

Where the Boundaries Still Matter

None of this diminishes the importance of selecting the appropriate platform for workloads that demand a different architecture. High-performance AI pipelines, large-scale analytics environments, and use cases requiring massive parallelism remain firmly within the domain of FlashBlade.
The ability to scale performance linearly, distribute metadata across many nodes, and support billions of objects is not optional in these scenarios — it is essential.

What has changed is not the relevance of those systems, but the necessity of deploying them for every object storage use case.

A Subtle but Significant Shift

The introduction of S3 on FlashArray does not represent a replacement of one architecture with another. It represents a convergence of capabilities within a unified operational framework. Object storage, in this model, is no longer a destination that requires its own platform. It becomes a capability — one of several ways to access and manage data within the same system.

That shift is easy to overlook, but its implications are significant. It allows organizations to design around outcomes rather than protocols, to reduce complexity without sacrificing capability, and to align infrastructure more closely with the needs of modern applications.

Closing Reflection

Looking back at that line in Part 2, it is clear that the reaction was not just about a new feature appearing in the interface. It was about the recognition — however incomplete at the time — that something foundational was beginning to change. Object storage did not suddenly become simpler, nor did it lose the architectural complexity that defines it. What changed is where it lives.

And once that becomes clear, you start asking a slightly uncomfortable but very honest question: If this works… and it works well enough for most of what I actually need… why was I so convinced it had to live somewhere else in the first place?

That is usually where the interesting work begins.

Appreciate you reading.

Dmitry Gorbatov
© 2025 Dmitry Gorbatov | #dmitrywashere
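The object store limits quoted in this post (250 S3 buckets per array and 1,000,000 objects per bucket, as of Purity 6.10.5+) lend themselves to a quick pre-flight check when deciding whether a workload fits FlashArray or belongs on FlashBlade. The sketch below is only an illustration under those stated numbers: BucketPlan, check_flasharray_fit, and the sample bucket names are hypothetical, and the limits should be confirmed against current documentation before relying on them.

```python
from dataclasses import dataclass

# Limits as quoted in the post for Purity 6.10.5+ (assumed here, not queried from an array).
MAX_BUCKETS_PER_ARRAY = 250
MAX_OBJECTS_PER_BUCKET = 1_000_000


@dataclass
class BucketPlan:
    """A planned bucket and a rough estimate of how many objects it will hold."""
    name: str
    expected_objects: int


def check_flasharray_fit(plans: list[BucketPlan]) -> list[str]:
    """Return a list of reasons the plan exceeds the quoted limits (empty if it fits)."""
    problems = []
    if len(plans) > MAX_BUCKETS_PER_ARRAY:
        problems.append(
            f"{len(plans)} buckets planned, over the {MAX_BUCKETS_PER_ARRAY}-bucket limit"
        )
    for plan in plans:
        if plan.expected_objects > MAX_OBJECTS_PER_BUCKET:
            problems.append(
                f"bucket '{plan.name}' expects {plan.expected_objects:,} objects, "
                f"over the {MAX_OBJECTS_PER_BUCKET:,}-object limit"
            )
    return problems


# Hypothetical plan: one bucket fits comfortably, one does not.
plans = [BucketPlan("build-artifacts", 200_000), BucketPlan("app-logs", 5_000_000)]
for problem in check_flasharray_fit(plans):
    print(problem)
```

A non-empty result is a hint that the workload sits closer to the scale-out cases the post assigns to FlashBlade.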
dmitrywashere
10 days ago Place User Blogs
Boosting SQL Server Backup/Restore Performance: Threads and Parallelism
In this post, we’ll discuss day 1 tuning you can do on your database hosts to take full advantage of your new high-performance backup storage. We’ll go over a few tricks around database layout and backup configuration for maximum throughput, discuss some quirks with SMB, and finally discuss using S3 effectively.
markwdev
1 month ago Place User Blogs
🍀 Don’t Rely on Luck: A St. Patrick’s Day Reminder to Secure Your Fleet
St. Patrick’s Day is a celebration of luck, fortune, and four-leaf clovers—but when it comes to cybersecurity, luck is not a strategy. You cannot rely on chance to secure your environment. You need visibility, control, and proactive remediation. As threats continue to evolve and vulnerabilities are discovered across the industry, the most important first step in protecting your infrastructure is simple: Know exactly what you’re running.

Step 1: Build a current, accurate fleet inventory

The adage "You can't protect what you can't see" is a fundamental principle of cybersecurity. A comprehensive, real-time inventory of your storage fleet sets the foundation for security hygiene. That includes:

Every array in your fleet
Every active version of the Purity operating environment
Exposure to known security vulnerabilities
Identification of arrays that may require upgrades or patches

The Everpure Pure1® Fleet Security Assessment Center provides this visibility in a single, centralized view:

🔗 Pure1 Fleet Security Assessment Center (login required)
https://pure1.purestorage.com/app/dashboard/assessment/security

This dashboard identifies:

All Purity versions active in your fleet
Arrays running non-recommended versions
Potential exposure to known CVEs
Security posture gaps requiring action

Step 2: Understand vulnerability exposure

Staying informed about known vulnerabilities is critical. The Everpure CVE Database provides transparent tracking of security advisories affecting our products:

🔗 Everpure CVE Database (login required)
https://support.purestorage.com/bundle/z-kb-articles-cve/page/cve-database.html

This resource allows you to:

Review impacted Purity versions
Understand severity and CVSS scoring
Identify fixed or remediated versions
Access mitigation guidance

Step 3: Upgrade or patch—don’t wait

If your fleet assessment identifies risk exposure, action is required.
We strongly urge customers to ensure:

All arrays are upgraded to the recommended fixed Purity versions
OR
Appropriate patches are applied to remediate identified vulnerabilities

Security is not static. Staying current ensures:

Reduced attack surface
Stronger cryptographic protections
Hardened operating environments
Continued alignment with best practices

Reinforce with security best practices

Beyond version management, follow our published security guidance for both FlashArray™ and FlashBlade® platforms:

FlashArray Security Best Practices (login required)
https://support.purestorage.com/bundle/m_flasharray_security/page/FlashArray/FlashArray_Security/topics/c_flasharray_security_overview_best_practices.html

FlashBlade Security Best Practices (login required)
https://support.purestorage.com/bundle/m_security_resources/page/FlashBlade/FlashBlade_Security/topics/concept/c_purityfb_4.5_security_best_practices.html

These white papers outline:

Secure configuration recommendations
Access control hardening
Encryption best practices
Monitoring and logging guidance

Final thought

On St. Patrick’s Day, luck may bring you a pot of gold. But in cybersecurity, luck only buys you time—and time runs out. A secure environment requires:

A current fleet inventory
Continuous vulnerability awareness
Timely upgrades and patching
Adherence to security best practices

Don’t rely on luck to protect your data. Take control of your security posture today.

Happy St. Patrick’s Day—and stay secure. 🍀💪
bcrandall
1 month ago Place User Blogs
Ask Us Everything: Everpure Object — What You Need to Know
Why Object Exists (and Why It’s Different)

Justin opened with a reset that resonated: file and object may both store unstructured data, but they are built on different assumptions. File storage evolved from human workflows — folders, directories, locking semantics, POSIX guarantees. That model works well for users and shared drives. But those same assumptions become friction at cloud scale.

Object storage was built for machines. It uses a flat namespace, atomic operations, embedded metadata, and native versioning. That’s why modern applications — backup platforms, analytics engines, AI frameworks — increasingly request S3 buckets instead of file shares. It’s not that file storage is going away; it’s that machines prefer object.

Scale: 3.8 Trillion Objects and Counting

One of the standout moments was a validation that Everpure ran for a customer, which tested 3.8 trillion objects in a single bucket on FlashBlade. They didn’t stop because they hit a ceiling — they stopped because they ran out of time.

That matters because unlimited scaling isn’t guaranteed in most on-prem object systems. Many legacy solutions quietly impose metadata or bucket limits that don’t surface until you’re deep into production. If your roadmap includes AI datasets, large backup repositories, analytics pipelines, or content delivery use cases, scale limits quickly become real-world constraints.

Object for AI: Performance Has Changed the Conversation

Using object for AI dominated the Q&A — and for good reason. Training workloads demand enormous throughput, especially for checkpointing bursts across large GPU clusters. Inference workloads are more latency-sensitive and read-heavy.

FlashBlade’s architecture, including S3 over RDMA, separates metadata authentication from the data path and enables direct, high-throughput access to data nodes. The team referenced performance in the hundreds of GB/sec range on multi-chassis systems.
Justin made an important observation: AI initially landed on file systems simply because object storage wasn’t considered performant enough. That assumption is changing rapidly.

Object on FlashArray: The “Alongside Block” Story

A lot of questions focused on object running on FlashArray — resiliency, performance expectations, and which workloads are a fit. Writes are acknowledged only after safe persistence, and standard object retry logic handles failure scenarios cleanly. So, you can be sure of data integrity, even if a controller fails.

FlashArray Object is designed for smaller-scale S3 use cases: artifact repositories, container workloads, image stores, edge environments, and test/dev scenarios. FlashBlade remains the scale-out platform for massive object footprints. Over time, Everpure Fusion will increasingly abstract placement decisions so workloads land on the right platform without adding operational complexity.

Data Reduction and Garbage Collection: The Hidden Advantages

One of the more practical differentiators discussed was garbage collection. Many legacy object systems struggle with delete churn because of layered indirection — objects are marked, then nodes are marked, then underlying file systems are marked, then media eventually reclaims space.

Because Everpure controls the stack end-to-end — logical object through physical media — reclamation is cohesive and efficient. Combined with always-on compression and similarity-based DeepReduce techniques, customers see meaningful space savings without sacrificing performance.

Migration: It’s an Application Decision

Perhaps the most important takeaway: moving from file to object isn’t a storage copy exercise. It’s an application transition. Backup software, artifact repositories, and analytics platforms increasingly support object natively. Let the application drive the migration instead of trying to brute-force a file-to-object copy.
Object is growing quickly, but the shift doesn’t require abandoning everything at once. With FlashArray for edge and unified workloads, FlashBlade for scale-out performance, and Everpure Fusion tying it together, we are building a platform where object can grow naturally alongside block — not replace it overnight. If you have follow-up questions, bring them into the Pure Community. The conversation around object is only getting bigger.
Flashman
1 month ago Place User Blogs
Simplifying Observability: Native OpenTelemetry in Purity
As enterprises modernize and accelerate their infrastructure through automation, blind spots become more expensive. When systems move faster, teams need telemetry that’s reliable, portable, and easy to integrate across a heterogeneous stack. Pure Storage’s Enterprise Data Cloud vision reflects that shift: infrastructure that delivers cloud-like simplicity and speed while preserving the control, security, and performance enterprises expect. Fusion supports this by standardizing and scaling self-service workflows, turning storage into an on-demand platform.

But faster operations require a stronger feedback loop. As automation increases, teams need confidence that systems remain healthy and predictable. That’s why consolidated observability is foundational. Instead of running separate monitoring tools per layer, organizations are centralizing telemetry into a single observability platform that can correlate signals end-to-end: from the end user’s experience (e.g., browser or mobile app), through the network and application code, all the way down to infrastructure like servers, databases, containers, and storage. This consolidation reduces redundant tools and fragmented dashboards while giving teams the correlated insights they need to resolve incidents faster and make better decisions.

The Siloed Vendor Problem

Yet achieving this unified vision has proven challenging. Traditional infrastructure vendors have long provided proprietary monitoring tools designed exclusively for their own products. A storage vendor offers one monitoring interface, the compute vendor another, and the network vendor yet another. Each tool uses different data formats, separate dashboards, and incompatible alerting mechanisms. For organizations running heterogeneous environments (which is nearly all of them), this creates an untenable situation.
IT teams must context-switch between multiple tools, correlate data manually across platforms, and maintain expertise in numerous vendor-specific interfaces. When an application performance issue arises, determining whether the root cause lies in storage latency, network congestion, or compute resource exhaustion becomes an exercise in detective work across disconnected systems.

The promise of consolidated observability cannot be realized with vendor-specific, siloed monitoring tools. A different approach is needed.

The Open Standard Solution

This challenge has driven the industry toward open, vendor-agnostic standards that enable telemetry interoperability. OpenMetrics emerged as one such standard, providing a common data model for exposing metrics (counters, gauges, and histograms) in a format that any observability platform can consume. By standardizing metric exposition, OpenMetrics reduced vendor lock-in and became foundational to Prometheus-based monitoring at scale.

However, standardizing the format of metrics is only one part of what organizations need to make consolidated observability work in practice. Enterprises also need consistency in how telemetry is named, described, transported, and exported, so that infrastructure data can flow cleanly across heterogeneous environments without bespoke integrations.

Enter OpenTelemetry, which expands on the same vendor-neutral principles to create a comprehensive observability framework. In other words, it helps ensure telemetry isn’t just emitted in a readable format, but is also structured and delivered in a way that remains portable across vendors and backends. Think of it as establishing the equivalent of a USB standard for telemetry data: any "device" (an application or infrastructure component) can plug into any "peripheral" (an observability platform) without requiring proprietary connectors.

The primary benefit is profound: freedom from vendor lock-in.
Organizations can choose best-of-breed observability platforms based on capabilities and cost rather than being constrained by what their infrastructure vendors support.
The External Agent Bottleneck
OpenTelemetry and OpenMetrics have made consolidated observability technically feasible, but most storage vendors have adopted these standards through what can only be described as a "bolt-on" approach. This forces customers to manage a complex chain of external agents, sidecars, or dedicated VMs just to get telemetry from their platforms onto their dashboards. The problem is two-fold:
Operational Overhead: Instead of simply consuming data, IT teams are burdened with sizing, patching, and troubleshooting the monitoring infrastructure itself.
New Failure Modes: If an agent crashes or becomes misconfigured, visibility into critical infrastructure disappears precisely when it's needed most. Teams find themselves monitoring their monitoring infrastructure, a meta-problem that defeats the original purpose.
The Native Integration Imperative
In the Pure Storage platform, observability is a first-class capability instead of an afterthought. Thus, Pure Storage has taken a different path: an OpenTelemetry collector embedded into Purity OS. Instead of asking customers to deploy and maintain external agents, exporters, or intermediary infrastructure, Pure Storage platforms will now expose telemetry in standardized OpenTelemetry format as an intrinsic platform capability. As a result, storage telemetry flows directly into any OpenTelemetry-compatible observability platform of choice (e.g., Datadog, Dynatrace, Splunk, Grafana).
[Figure: workflow diagram. Numbers represent the sequence of steps in the workflow.]
Pure Storage’s commitment has always been simplicity. Native OpenTelemetry in Purity OS extends that principle to observability: less integration friction, fewer moving parts, and more time spent acting on insight instead of maintaining the pipeline.
More information on the native integration of OpenTelemetry Collector within Purity//FB can be found here. Purity//FA to follow soon.
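To make the "common data model" point concrete, here is a minimal sketch of what consuming OpenMetrics-style text exposition can look like. It is an illustration only: the metric names and label values are hypothetical and are not the metrics Purity actually emits, and the parser handles just the simple cases shown (no escaped label values, timestamps, or exemplars).

```python
import re

# Hypothetical exposition text; metric names and labels are invented for illustration.
SAMPLE = """\
# TYPE storage_read_latency_seconds gauge
# HELP storage_read_latency_seconds Hypothetical read latency.
storage_read_latency_seconds{array="fb-01"} 0.00042
# TYPE storage_ops counter
storage_ops_total{array="fb-01",op="read"} 1027
# EOF
"""

# name, optional {labels}, then the sample value
LINE_RE = re.compile(r'^([A-Za-z_:][A-Za-z0-9_:]*)(?:\{(.*)\})?\s+(\S+)')
LABEL_RE = re.compile(r'([A-Za-z_][A-Za-z0-9_]*)="([^"]*)"')


def parse_openmetrics(text):
    """Return (types, samples) from simple OpenMetrics-style text.

    types maps metric family name -> declared type.
    samples is a list of (sample_name, labels_dict, float_value).
    """
    types = {}
    samples = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line == "# EOF":
            continue
        if line.startswith("# TYPE "):
            # "# TYPE <family> <type>"
            _, _, family, kind = line.split(" ", 3)
            types[family] = kind
            continue
        if line.startswith("#"):
            continue  # HELP, UNIT, and other metadata are ignored in this sketch
        match = LINE_RE.match(line)
        if match is None:
            continue  # skip lines this simplified parser does not understand
        name, raw_labels, value = match.groups()
        labels = dict(LABEL_RE.findall(raw_labels or ""))
        samples.append((name, labels, float(value)))
    return types, samples


types, samples = parse_openmetrics(SAMPLE)
for name, labels, value in samples:
    print(name, labels, value)
```

Because the type, name, and labels travel with every sample, any backend that understands the format can ingest the data without a vendor-specific adapter, which is the interoperability argument made above.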
sananta
2 months ago Place User Blogs
502Views
0likes
0Comments
Ask Us Everything: Evergreen//One™ Edition — What the Community Learned
A recent Ask Us Everything (AUE) session on Pure Storage Evergreen//One™ was a lively, deeply technical conversation—and exactly the kind of dialogue that makes the Pure Community special. Here are some of the biggest takeaways, organized around the questions asked and the insights that followed.
kevinr
2 months ago Place User Blogs
215Views
0likes
0Comments
Stop Prompting, Start Context Engineering
This blog post argues that Context Engineering is the critical new discipline for building autonomous, goal-driven AI agents. Since Large Language Models (LLMs) are stateless and forget information outside their immediate context window, Context Engineering focuses on assembling and managing the necessary information—such as session history, long-term memory (embeddings, RAG indexes), and tool outputs—for the agent every single turn. The post asserts that storage, not the LLM or the prompt, is the primary performance bottleneck for AI at scale. The speed of the underlying storage architecture dictates the agent's responsiveness because it must quickly retrieve and persist context data repeatedly.
kgautam
3 months ago Place User Blogs
108Views
3likes
0Comments
How to Use Logstash to Send Directly to an S3 Object Store
This article originally appeared on Medium.com and has been republished with permission from the author.
To aggregate logs directly to an object store like FlashBlade, you can use the Logstash S3 output plugin. Logstash aggregates and periodically writes objects on S3, which are then available for later analysis. This plugin is simple to deploy and does not require additional infrastructure and complexity, such as a Kafka message queue.
A common use case is to leverage an existing Logstash system that filters out a small percentage of log lines to send to an Elasticsearch cluster. A second output to S3 would keep all log lines in raw (un-indexed) form for ad-hoc analysis and machine learning. The diagram below illustrates this architecture, which balances expensive indexing and raw data storage.
Logstash Configuration
An example Logstash config highlights the parts necessary to connect to FlashBlade S3 and send logs to the bucket "logstash," which should already exist. The input section is a trivial example and should be replaced by your specific input sources (e.g., Filebeat).
    input {
      file {
        path => ["/home/logstash/testdata.log"]
        sincedb_path => "/dev/null"
        start_position => "beginning"
      }
    }
    filter {
    }
    output {
      stdout { codec => rubydebug }
      s3 {
        access_key_id => "XXXXXXXX"
        secret_access_key => "YYYYYYYYYYYYYY"
        endpoint => "https://10.62.64.200"
        bucket => "logstash"
        additional_settings => { "force_path_style" => true }
        time_file => 5
        codec => "plain"
      }
    }
Note that the force_path_style setting is required; a FlashBlade endpoint needs path-style addressing instead of virtual-host addressing. Path-style addressing does not require co-configuration with DNS servers and therefore is simpler in on-premises environments.
For a more secure option, instead of specifying the access and secret keys in the pipeline configuration file, specify them as the environment variables AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY.
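To illustrate why path-style addressing avoids DNS configuration, the sketch below builds both URL forms for the same object. The helper names are hypothetical and this is not how Logstash or the AWS SDK constructs requests internally; it only shows where the bucket name ends up in each style.

```python
from urllib.parse import quote, urlsplit, urlunsplit


def path_style_url(endpoint, bucket, key):
    """Bucket goes in the URL path; the host stays exactly as configured."""
    parts = urlsplit(endpoint)
    path = "/" + quote(bucket) + "/" + quote(key)
    return urlunsplit((parts.scheme, parts.netloc, path, "", ""))


def virtual_hosted_url(endpoint, bucket, key):
    """Bucket becomes a hostname prefix, so every bucket needs a resolvable name."""
    parts = urlsplit(endpoint)
    host = bucket + "." + parts.netloc
    return urlunsplit((parts.scheme, host, "/" + quote(key), "", ""))


endpoint = "https://10.62.64.200"  # endpoint from the example config
print(path_style_url(endpoint, "logstash", "ls.s3.example.txt"))
print(virtual_hosted_url(endpoint, "logstash", "ls.s3.example.txt"))
```

With a bare IP endpoint, the virtual-hosted form produces a hostname (bucket prefix plus IP address) that cannot resolve without extra DNS setup, whereas the path-style form reaches the same data VIP directly.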
Logstash can trade off efficiency of writing to S3 against the possibility of data loss through the two configuration options "time_file" and "size_file," which control how often buffered lines are flushed to an object. Larger flushes produce more efficient writes and larger objects, but also widen the window of possible data loss if a node fails. The maximum amount of data loss is the smaller of "size_file" and "time_file" worth of data.
Validation Test
To test the flow of data through Logstash to FlashBlade S3, I use the public Docker image for Logstash. Starting with the configuration file shown above, customize the fields for your specific FlashBlade environment and place the file in the ${PWD}/pipeline/ directory. I then volume-mount the configuration into the Logstash container at runtime.
Start a Logstash server as a Docker container as follows:
    > docker run --rm -it -v ${PWD}/pipeline/:/usr/share/logstash/pipeline/ -v ${PWD}/logs/:/home/logstash/ docker.elastic.co/logstash/logstash:7.6.0
Note that I also volume-mounted the ${PWD}/logs/ directory, which is where Logstash will look for incoming data.
In a second terminal, I generate synthetic data with the flog tool, writing into the shared "logs/" directory:
    > docker run -it --rm mingrammer/flog > logs/testdata.log
Logstash will automatically pick up this new log data and start writing to S3. Then look at the output on S3 with s5cmd; in my example the result is three objects written (5MB, 5MB, and 17KB in size).
    > s5cmd ls s3://logstash/
    + 2020/02/28 04:09:42 17740 ls.s3.03210fdc-c108-4e7d-8e49-72b614366eab.2020-02-28T04.04.part28.txt
    + 2020/02/28 04:10:21 5248159 ls.s3.5fe6d31b-8f61-428d-b822-43254d0baf57.2020-02-28T04.10.part30.txt
    + 2020/02/28 04:10:21 5256712 ls.s3.9a7f33e2-fba5-464f-8373-29e9823f5b3a.2020-02-28T04.09.part29.txt
Making Use of Logs Data with Spark
In PySpark, the log lines can be loaded for a specific date as follows:
    logs = sc.textFile("s3a://logstash/ls.s3.*.2020-02-29*.txt")
Because the ordering of the key places the UUID before the date, each time a new Spark dataset is created it will require enumerating all objects. This is an unfortunate consequence of not having the key prefixes in the right order for sorting by date.
Once loaded, you can perform custom parsing and analysis, use the Spark-Elasticsearch plugin to index the full set of data, or start machine learning experiments with SparkML.
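The key layout in the listing above can also be parsed on the client side. The sketch below is a hedged illustration, assuming keys always follow the ls.s3.<uuid>.<date>T<HH.MM>.part<N>.txt pattern seen in that output; the helper names are hypothetical. Note that it filters an already-enumerated list of keys, so it does not avoid the full-listing cost described above.

```python
import re
from typing import List, NamedTuple, Optional

# Pattern inferred from the example listing; other Logstash settings may yield other layouts.
KEY_RE = re.compile(
    r"^ls\.s3\.(?P<uuid>[0-9a-f-]{36})"
    r"\.(?P<date>\d{4}-\d{2}-\d{2})T(?P<hour>\d{2})\.(?P<minute>\d{2})"
    r"\.part(?P<part>\d+)\.txt$"
)


class LogObject(NamedTuple):
    key: str
    date: str
    hour: int
    minute: int
    part: int


def parse_key(key: str) -> Optional[LogObject]:
    """Split an object key into its date, time, and part number, or return None."""
    m = KEY_RE.match(key)
    if m is None:
        return None
    return LogObject(key, m["date"], int(m["hour"]), int(m["minute"]), int(m["part"]))


def keys_for_date(keys: List[str], date: str) -> List[str]:
    """Return the keys written on `date` (YYYY-MM-DD), ordered by time and part."""
    parsed = [p for p in map(parse_key, keys) if p is not None and p.date == date]
    parsed.sort(key=lambda p: (p.hour, p.minute, p.part))
    return [p.key for p in parsed]


KEYS = [
    "ls.s3.03210fdc-c108-4e7d-8e49-72b614366eab.2020-02-28T04.04.part28.txt",
    "ls.s3.5fe6d31b-8f61-428d-b822-43254d0baf57.2020-02-28T04.10.part30.txt",
    "ls.s3.9a7f33e2-fba5-464f-8373-29e9823f5b3a.2020-02-28T04.09.part29.txt",
]
for key in keys_for_date(KEYS, "2020-02-28"):
    print(key)
```

The resulting list could be joined with commas and passed to sc.textFile in place of the wildcard, which keeps the date selection explicit while the listing itself still has to scan every key.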
catud
4 months ago Place User Blogs
66Views
0likes
0Comments
OT: The Architecture of Interoperability
In the previous post, we explored the fundamental divide between Information Technology (IT) and Operational Technology (OT). We established that while IT manages data and applications, OT controls the physical heartbeat of our world, from factory floors to water treatment plants. In this post, we dive deeper into the bridge that connects them: interoperability.
As Industry 4.0 and the Internet of Things (IoT) accelerate, the "air gap" that once separated these domains is evolving. For modern enterprises, the goal isn't just to have IT and OT coexist, but to have them communicate seamlessly. Whether the use case is security, real-time quality control, or predictive maintenance, interoperability becomes the critical engine for operational excellence.
The Interoperability Architecture
Interoperability is more than just connecting cables; it’s about creating a unified architecture where data flows securely between the shop floor and the “top floor”. In legacy environments, OT systems (like SCADA and PLCs) often run on isolated, proprietary networks that don’t speak the same language as IT’s cloud-based analytics platforms. To bridge this, a robust interoperability architecture is required. This architecture must support:
Industrial Data Lake: A single storage platform that can handle block, file, and object data is essential for bridging the gap between IT and OT. This unified approach prevents data silos by allowing proprietary OT sensor data to coexist on the same high-performance storage as IT applications (such as ERP and CRM). The benefit is a high-performance Industrial Data Lake, where OT and IT data from various sources can be streamed directly, minimizing the need for data movement, a critical efficiency gain.
Real Time Analytics: OT sensors continuously monitor machine conditions, including vibration, temperature, and other critical parameters, generating real-time telemetry data.
An interoperable architecture built on high-performance flash storage enables instant processing of this data stream. By integrating IT analytics platforms with predictive algorithms, the system identifies anomalies before they escalate, accelerating maintenance response, optimizing operations, and streamlining exception handling. This approach reduces downtime, lowers maintenance costs, and extends overall asset life.
Standards Based Design: As outlined in recent cybersecurity research, modern OT environments require datasets that correlate physical process data with network traffic logs to detect anomalies effectively. An interoperable architecture facilitates this by centralizing data for analysis without compromising the security posture. Also, IT/OT convergence requires a platform capable of securely managing OT data, often through IT standards. An API-first design allows the entire platform to be built on robust APIs, enabling IT to easily integrate storage provisioning, monitoring, and data protection into standard, policy-driven IT automation tools (e.g., Kubernetes, orchestration software).
Pure Storage addresses these interoperability requirements with the Purity operating environment, which abstracts the complexity of underlying hardware and provides a seamless, multiprotocol experience (NFS, SMB, S3, FC, iSCSI). This ensures that whether data originates from a robotic arm or a CRM application, it is stored, protected, and accessible through a single, unified data plane.
Real-World Application: A Large Regional Water District
Consider a large regional water district, a major provider serving millions of residents. In an environment like this, maintaining water quality and service reliability is a 24/7 mission-critical OT function. Its infrastructure relies on complex SCADA systems to monitor variables like flow rates, tank levels, and chemical compositions across hundreds of miles of pipelines and treatment facilities.
By adopting an interoperable architecture, an organization like this can break down the silos between its operational data and its IT capabilities. Instead of SCADA data remaining locked in a control room, it can be securely replicated to IT environments for long-term trending and capacity planning. For instance, historical flow data combined with predictive analytics can help forecast demand spikes or identify aging infrastructure before a leak occurs. This convergence transforms raw operational data into actionable business intelligence, ensuring reliability for the communities the district serves.
Why We Champion Compliance and Governance
Opening up OT systems to IT networks can introduce new risks. In the world of OT, "move fast and break things" is not an option; reliability and safety are paramount. This is why Pure Storage wraps interoperability in a framework of compliance and governance, including but not limited to:
FIPS 140-2 Certification & Common Criteria: We utilize FIPS 140-2 certified encryption modules and have achieved Common Criteria certification.
Data Sovereignty: Our architecture includes built-in governance features like Always-On Encryption and rapid data locking to ensure compliance with domestic and international regulations, protecting sensitive data regardless of where it resides.
Compliance: Pure Fusion delivers policy-defined storage provisioning, automating deployment with specified requirements for tags, protection, and replication.
By embedding these standards directly into the storage array, Pure Storage allows organizations to innovate with interoperability while maintaining the security posture that critical OT infrastructure demands.
Next in the series: We will explore IT/OT interoperability further, along with processing data at the edge. Stay tuned!
ebiser
4 months ago Place User Blogs
86Views
0likes
0Comments