Pittsburgh PUG - Launch Party @ River's Casino Drum Bar
You're Invited! PUG - Time to Launch
Celebrating Pure Storage + Nutanix + Expedient
Join us for a special Pure User Group event as we celebrate two of the industry's most loved technologies coming together in a powerful new way: Pure Storage and Nutanix. Even better: Expedient becomes the FIRST cloud service provider to bring this combined solution to market. This event is all about bringing our Pittsburgh-area community together to learn, connect, and celebrate a major milestone in the hybrid cloud and on-prem cloud ecosystem.
What You'll Experience
- A deep dive into the new Pure Storage + Nutanix integration
- How Expedient is delivering it as a fully managed cloud service
- Real-world use cases for cloud-smart modernization
- Customer-driven conversation, not vendor slides
- Networking with peers, experts, and the local PUG community
- Food, drinks, and launch-party fun
Why This Matters
This three-way partnership brings customers:
- NVMe-fast, always-on performance
- Effortless scalability and hybrid cloud freedom
- A cloud service built for simplicity and resiliency
- Lower operational overhead: no firefighting, no forklift upgrades
It's the stack that "just works," so your teams can focus on innovation instead of maintenance.
OT: The Architecture of Interoperability
In a previous post, we explored the fundamental divide between Information Technology (IT) and Operational Technology (OT). We established that while IT manages data and applications, OT controls the physical heartbeat of our world, from factory floors to water treatment plants. In this post we dive deeper into the bridge that connects them: interoperability.
As Industry 4.0 and the Internet of Things (IoT) accelerate, the "air gap" that once separated these domains is evolving. For modern enterprises, the goal isn't just to have IT and OT coexist, but to have them communicate seamlessly. Whether the use case is security, real-time quality control, or predictive maintenance, to name a few, interoperability becomes the critical engine for operational excellence.
The Interoperability Architecture
Interoperability is more than just connecting cables; it's about creating a unified architecture where data flows securely between the shop floor and the "top floor." In legacy environments, OT systems (like SCADA and PLCs) often run on isolated, proprietary networks that don't speak the same language as IT's cloud-based analytics platforms. To bridge this, a robust interoperability architecture is required. This architecture must support:
- Industrial Data Lake: A single storage platform that can handle block, file, and object data is essential for bridging the gap between IT and OT. This unified approach prevents data silos by allowing proprietary OT sensor data to coexist on the same high-performance storage as IT applications (such as ERP and CRM). The benefit is a high-performance industrial data lake, where OT and IT data from various sources can be streamed directly, minimizing the need for data movement, a critical efficiency gain.
- Real-Time Analytics: OT sensors continuously monitor machine conditions, including vibration, temperature, and other critical parameters, generating real-time telemetry data. An interoperable architecture built on high-performance flash storage enables instant processing of this data stream. By integrating IT analytics platforms with predictive algorithms, the system identifies anomalies before they escalate, accelerating maintenance response, optimizing operations, and streamlining exception handling. This approach reduces downtime, lowers maintenance costs, and extends overall asset life.
- Standards-Based Design: As outlined in recent cybersecurity research, modern OT environments require datasets that correlate physical process data with network traffic logs to detect anomalies effectively. An interoperable architecture facilitates this by centralizing data for analysis without compromising the security posture. IT/OT convergence also requires a platform capable of securely managing OT data, often through IT standards. An API-first design allows the entire platform to be built on robust APIs, enabling IT to easily integrate storage provisioning, monitoring, and data protection into standard, policy-driven IT automation tools (e.g., Kubernetes, orchestration software).
Pure Storage addresses these interoperability requirements with the Purity operating environment, which abstracts the complexity of the underlying hardware and provides a seamless, multiprotocol experience (NFS, SMB, S3, FC, iSCSI). This ensures that whether data originates from a robotic arm or a CRM application, it is stored, protected, and accessible through a single, unified data plane.
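To make the industrial data lake idea above more concrete, here is a minimal sketch of how an OT-side collector might land telemetry on an S3-compatible object store. It uses the standard boto3 client; the endpoint URL, bucket name, credentials, and field names are illustrative assumptions rather than any Pure Storage-specific API.

```python
# Minimal sketch: landing OT telemetry in an S3-compatible industrial data lake.
# The endpoint, bucket name, and credentials below are hypothetical placeholders.
import json
import time

import boto3

s3 = boto3.client(
    "s3",
    endpoint_url="https://objectstore.example.internal",  # assumed S3 endpoint
    aws_access_key_id="EXAMPLE_ACCESS_KEY",
    aws_secret_access_key="EXAMPLE_SECRET_KEY",
)

def publish_reading(sensor_id: str, vibration: float, temperature: float) -> None:
    """Write one sensor reading as a JSON object, keyed by sensor and timestamp."""
    reading = {
        "sensor_id": sensor_id,
        "vibration_mm_s": vibration,
        "temperature_c": temperature,
        "ts": time.time(),
    }
    key = f"telemetry/{sensor_id}/{int(reading['ts'])}.json"
    s3.put_object(
        Bucket="industrial-data-lake",  # assumed bucket shared by OT and IT pipelines
        Key=key,
        Body=json.dumps(reading).encode("utf-8"),
    )

publish_reading("press-line-07", vibration=2.4, temperature=71.3)
```

Because the readings land as ordinary objects, IT-side analytics tools can read the same bucket directly, which is the "minimize data movement" benefit described above.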
Real-World Application: A Large Regional Water District
Consider a large regional water district, a major provider serving millions of residents. In an environment like this, maintaining water quality and service reliability is a 24/7, mission-critical OT function. Its infrastructure relies on complex SCADA systems to monitor variables like flow rates, tank levels, and chemical compositions across hundreds of miles of pipelines and treatment facilities.
By adopting an interoperable architecture, an organization like this can break down the silos between its operational data and its IT capabilities. Instead of SCADA data remaining locked in a control room, it can be securely replicated to IT environments for long-term trending and capacity planning. For instance, historical flow data combined with predictive analytics can help forecast demand spikes or identify aging infrastructure before a leak occurs. This convergence transforms raw operational data into actionable business intelligence, ensuring reliability for the communities they serve.
Why We Champion Compliance and Governance
Opening up OT systems to IT networks can introduce new risks. In the world of OT, "move fast and break things" is not an option; reliability and safety are paramount. This is why Pure Storage wraps interoperability in a framework of compliance and governance, including but not limited to:
- FIPS 140-2 Certification & Common Criteria: We utilize FIPS 140-2 certified encryption modules and have achieved Common Criteria certification.
- Data Sovereignty: Our architecture includes built-in governance features like Always-On Encryption and rapid data locking to ensure compliance with domestic and international regulations, protecting sensitive data regardless of where it resides.
- Compliance: Pure Fusion delivers policy-defined storage provisioning, automating deployments with specified requirements for tags, protection, and replication.
By embedding these standards directly into the storage array, Pure Storage allows organizations to innovate with interoperability while maintaining the security posture that critical OT infrastructure demands.
Next in the series: We will explore IT/OT interoperability further and look at processing data at the edge. Stay tuned!
Accelerating AI Delivery with Cloud Native Tooling
As AI evolves from simple model inference to agentic systems that act autonomously and interact across services, delivering these workloads efficiently has become increasingly complex. This session explores how cloud-native technologies enable scalable, secure, and observable deployment of modern AI agents. We'll examine the core characteristics of agentic workloads, the orchestration and networking patterns they require, and how cloud-native tooling accelerates experimentation, delivery, and reliability.
Understanding Deduplication Ratios
It's super important to understand where deduplication ratios come from in relation to backup applications and data storage. Deduplication prevents the same data from being stored again, lowering the data storage footprint. When hosting virtual environments on platforms like FlashArray//X™ and FlashArray//C™, you can see tremendous amounts of native deduplication due to the repetitive nature of those environments. Backup applications and targets have a different makeup. Even still, deduplication ratios have long been a talking point in the data storage industry and continue to be a decision point and factor in buying cycles. Data Domain pioneered this tactic to overstate its effectiveness, leaving customers thinking the vendor's appliance must have a magic wand to reduce data by 40:1. I wanted to take the time to explain how deduplication ratios are derived in this industry and the variables to look for in figuring out exactly what to expect in terms of deduplication and data footprint.
Let's look at a simple example of a data protection scenario.
Example: A company has 100TB of assorted data it wants to protect with its backup application. The necessary and configured agents go about doing the intelligent data collection and send the data to the target. Initially, and typically, the application will leverage both software compression and deduplication. Compression by itself will almost always yield a decent amount of data reduction. In this example, we'll assume 2:1, which means the first data set goes from 100TB to 50TB. Deduplication doesn't usually do much data reduction on the first baseline backup. Sometimes there are some efficiencies, like the repetitive data in virtual machines, but for the sake of this generic example scenario, we'll leave it at 50TB total.
So, full backup 1 (baseline): 50TB
Now, there are scheduled incremental backups that occur daily from Monday to Friday. Let's say these daily changes are 1% of the aforementioned data set. Each day, then, there would be 1TB of additional data stored. 5 days at 1TB = 5TB. Add the compression in to reduce that 2:1, and you have an additional 2.5TB added. The 50TB baseline plus 2.5TB of unique blocks means a total of 52.5TB of data stored.
Let's check the deduplication rate now: 105TB/52.5TB = 2x
You may ask: "Wait, that 2:1 is really just the compression? Where is the deduplication?" Great question, and the reason why I'm writing this blog. Deduplication prevents the same data from being stored again. With a single full backup and incremental backups, you wouldn't see much more than just the compression. Where deduplication measures impact is in the assumption that you would be sending duplicate data to your target. This is usually discussed as data under management. Data under management is the logical data footprint of your backup data, as if you were regularly backing up the entire data set, not just changes, without deduplication or compression.
For example, let's say we didn't schedule incremental backups but scheduled full backups every day instead. Without compression/deduplication, the data load would be 100TB for the initial baseline and then the same 100TB plus the daily growth.
Day 0 (baseline): 100TB
Day 1 (baseline + changes): 101TB
Day 2 (baseline + changes): 102TB
Day 3 (baseline + changes): 103TB
Day 4 (baseline + changes): 104TB
Day 5 (baseline + changes): 105TB
Total, if no compression/deduplication: 615TB
This 615TB total is data under management.
Now, if we look at our actual, post-compression/post-dedupe number from before (52.5TB), we can figure out the deduplication impact:
615TB/52.5TB = 11.714x
Looking at this over a 30-day period, you can see how the dedupe ratios can get really aggressive. For example:
100TB x 30 days = 3,000TB + (1TB x 30 days) = 3,030TB
3,030TB/65TB (actual data stored) = 46.62x dedupe ratio
In summary, with 100TB of front-end data and a 1% daily change rate over 1 week:
- Full backup + daily incremental backups = 52.5TB stored, and a 2x DRR
- Full daily backups = 52.5TB stored, and an 11.7x DRR
That is how deduplication ratios really work: they're a fictional function of "what if dedupe didn't exist, but you stored everything on the disk anyway" scenarios. They're a math exercise, not a reality exercise. Front-end data size, daily change rate, and retention are the biggest variables to look at when sizing or understanding the expected data footprint and the related data reduction/deduplication impact. In our scenario, we're looking at one particular data set. Most companies will have multiple data types, and there can be even greater redundancy when accounting for full backups across those as well. So while it matters, consider that a bonus.
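To tie the arithmetic together, here is a small sketch that reproduces the one-week example end to end. The variable names are illustrative; the 2:1 compression and 1% change-rate assumptions come straight from the scenario above.

```python
# Reproduces the one-week example from the post.
baseline_tb = 100.0       # front-end data set
change_rate = 0.01        # 1% daily change
compression = 2.0         # assumed 2:1 software compression
days = 5                  # Monday-Friday incrementals after the Day 0 baseline

daily_change_tb = baseline_tb * change_rate

# Physically stored: compressed baseline + compressed unique daily changes.
stored_tb = (baseline_tb + daily_change_tb * days) / compression          # 52.5

# Logical "data under management": a full copy every day, growing with changes.
dum_tb = sum(baseline_tb + d * daily_change_tb for d in range(days + 1))  # 615

print(f"stored: {stored_tb} TB")
print(f"DRR vs incrementals: {(baseline_tb + daily_change_tb * days) / stored_tb:.1f}x")  # 2.0x
print(f"DRR vs daily fulls:  {dum_tb / stored_tb:.1f}x")                                  # 11.7x
```

The same physical 52.5TB yields a 2x or an 11.7x ratio depending only on which logical footprint you divide by, which is exactly the point of the post.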
Pure's Intelligent Control Plane: Powered by AI Copilot, MCP Connectivity and Workflow Orchestration
At Accelerate 2025, we announced two capabilities that change how you manage Pure Storage in your broader infrastructure: AI Copilot with Model Context Protocol (MCP) and Workflow Orchestration with production-ready templates. Here's what they do and why they matter.
AI Copilot with MCP: Your Infrastructure, One Conversation
The Problem
Your infrastructure spans multiple platforms: Pure Storage managing your data, VMware running VMs, OpenShift handling containers, security tools monitoring threats, application platforms tracking performance - each with its own console, APIs, and workflows. When you need to migrate a VM or respond to a security incident, you're manually pulling information from each system, correlating it yourself, then executing actions across platforms. You become the integration layer.
The Solution
Pure1 now supports the Model Context Protocol (MCP), taking Copilot from a suggestive assistant to an active operator. With MCP enabled, Copilot doesn't just recommend - it acts. It serves as a secure bridge between natural language and your infrastructure, capable of fetching data, executing APIs, and orchestrating workflows across diverse systems.
Here's what makes this powerful: you deploy MCP servers within your environment - one for VMware, another for OpenShift, and others for the systems you use. Each server exposes your environment's capabilities through a standard, interoperable protocol (a minimal sketch of such a server appears at the end of this post). Pure Storage AI Copilot connects seamlessly to these MCP servers, as well as to Pure services such as Data Intelligence, Workflow Orchestration, and Portworx Monitoring, enabling unified and secure automation across your hybrid ecosystem.
What You Can Connect
You can deploy an MCP server on any system, whether it's your VMware environment, Kubernetes clusters, security platforms like CrowdStrike, databases, monitoring tools, or custom applications. Pure Storage AI Copilot connects to these servers under your control, securely combining their data with Pure Storage services to deliver richer insights and automation.
Getting Started: If you have a use case for MCP, please contact your Pure Storage account team.
Workflow Orchestration: Deploy in Minutes, Not Months
The Problem
Building production-grade automation takes months. You need error handling, integration with multiple systems, testing for edge cases, documentation, and ongoing maintenance. Most teams end up with half-finished scripts that only one person understands.
The Solution
We built workflow templates for common operations, tested them at scale, and made them available in Pure1. Install them, customize them to your needs, and run them in minutes.
Key Templates
VMware to OpenShift Migration with Portworx: Handles the complete migration: extracts VM metadata, identifies backing Pure volumes, checks OpenShift capacity, configures the vVols datastore and DirectAccess, uses array-based replication, and converts to Portworx format. Traditional migration takes hours for TB-scale VMs; this takes 20 to 30 minutes.
SQL / Oracle Database Clone and Copy: Automates cloning and copying of SQL Server and Oracle databases for dev/test or refresh needs. Instantly creates storage-efficient clones from snapshots, mounts them to target environments, and applies Pure-optimized settings. The hours-long manual process becomes a quick, consistent workflow completed in minutes.
Daily Fleet Health Check: Scans all arrays for capacity trends, performance issues, protection gaps, and hardware health, then posts a summary to Slack. Proactive visibility without manually checking each array.
Rubrik Threat Detection Response: When Rubrik detects a threat, the workflow automatically tags the affected Pure volumes, creates isolated immutable snapshots, and notifies the security team. Security events propagate to your storage layer automatically.
How It Works
Workflow Orchestration is a SaaS feature in Pure1. Deploy lightweight agents (Windows, Linux, or Docker) in your data center to execute workflows locally. Group agents together for high availability and governance controls.
Integrations
Native Pure Storage: Pure1 Connector for full API access, Fusion Connector for storage provisioning (works for Fusion and non-Fusion FlashArray/FlashBlade customers).
Third-Party: ServiceNow, Slack, Google, Microsoft, CrowdStrike, HTTP/webhooks, PagerDuty, Salesforce, and more. The connector library continues to expand.
Getting Started: Opt in now in Pure1 - Workflows. An introductory offer is available at this time. Check with your Pure account team if you have questions.
How They Work Together
At Accelerate 2025 in New York, we showcased this capability in action. Here's the scenario: an organization wants to migrate VMs to Kubernetes. Action-enabled Copilot orchestrates communication with Pure Storage appliances and services as well as third-party MCP servers to collect the required information for addressing a problem across a heterogeneous environment. With Pure1 MCP, AI Copilot, and Workflows, there's now a programmatic way to collect information from the OpenShift MCP server, the VMware MCP server, and Pure1 storage insights - then recommend which VMs to migrate based on your selection criteria.
You prompt Copilot: "How can I move my VMs to OpenShift in an efficient way?" Copilot communicates across:
- Your VMware MCP server - to get VM specifications, current configurations, and resource usage
- Your OpenShift MCP server - to check available cluster capacity and validate compatibility
- Portworx monitoring - to understand current storage performance
Copilot reasons across all this information, identifies ideal VM candidates based on your criteria, and recommends the migration approach - which VMs to move, target configurations, and how to preserve policies. Then it can trigger the migration workflow, keeping you updated throughout the process.
Why This Matters
Storage Admins: Stop being the bottleneck. Enable self-service while maintaining governance.
DevOps Teams: Deploy production-tested automation without writing code.
Security Teams: Build automated response workflows spanning detection, isolation, and recovery.
Infrastructure Leaders: Reduce operational overhead. Teams focus on strategy, not repetitive tasks.
Get Started
MCP Integration: If you have a use case for MCP, please contact your Pure Storage account team.
Workflow Orchestration: Opt in at Pure1 → Workflows.
Learn More: Documentation in Pure1, or contact your Pure Storage account team.
Pure1 has evolved from a monitoring platform to an Intelligent Control Plane. AI Copilot reasons across your infrastructure. Workflow Orchestration executes. Together, they change how you manage data with Pure Storage.
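To make the MCP server concept more concrete, here is a minimal sketch of what a small, read-only inventory server might look like, using the open-source Python MCP SDK (assumed to be installed as the `mcp` package). The tool name, inventory data, and fields are hypothetical placeholders, not a Pure Storage or VMware product API.

```python
# Minimal sketch of an MCP server exposing a read-only VM inventory tool.
# Assumes the open-source Python MCP SDK (pip install mcp); the tool name,
# backing data, and fields are hypothetical placeholders, not a product API.
from mcp.server.fastmcp import FastMCP

mcp = FastMCP("vm-inventory")

# Stand-in for a real call into vCenter, OpenShift, or another system of record.
FAKE_INVENTORY = [
    {"name": "app-db-01", "cpu": 8, "memory_gb": 64, "datastore": "pure-vvol-ds"},
    {"name": "web-frontend-02", "cpu": 4, "memory_gb": 16, "datastore": "pure-vvol-ds"},
]

@mcp.tool()
def list_vms(min_cpu: int = 0) -> list[dict]:
    """Return VMs with at least `min_cpu` vCPUs, so a client such as an
    AI copilot can shortlist migration candidates."""
    return [vm for vm in FAKE_INVENTORY if vm["cpu"] >= min_cpu]

if __name__ == "__main__":
    # Serves the tool over stdio so an MCP-capable client can connect to it.
    mcp.run()
```

An MCP-capable client can discover and call `list_vms` over the protocol, which mirrors the pattern described above, where Copilot queries the servers you deploy in your own environment.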
Who's using Pure Protect?
Hey everyone,
Just wondering if anyone else is using Pure Protect yet. We have gone through the quick start guide and have a VMware-to-VMware configuration set up. We have configured our first policy and group utilizing a test VM, but it seems to be stuck in the protection phase. I would be very interested to hear what others have seen or experienced.
-Charles
Pure Storage Delivers Critical Cyber Outcomes, Part Two: Fast Analytics
"We don't have storage problems. We have outcome problems." - Pure customer in a recent cyber briefing
No matter what we are buying, what we are really buying is a desired outcome. If you buy a car, you are buying some sort of outcome or multiple outcomes: point A to point B, comfort, dependability, seat heaters, or, if you are like me, a real, live Florida Man, seat coolers! The same is true when solving for cyber outcomes, and a storage foundation to drive cyber resilience is often overlooked. A strong storage foundation improves data security, resilience, and recovery. With these characteristics, organizations can recover in hours vs. days. Here are some top cyber resilience outcomes Pure Storage is delivering:
- Native, Layered Resilience
- Fast Analytics
- Rapid Restore
- Enhanced Visibility
We tackled layered resilience in part one, but what about Fast Analytics? Fast Analytics refers to native log storage used to review the environment and determine possible anomalies and other potential threats. This is a category of outcomes that was moved to the cloud, by the vendors themselves and therefore also by customers, but it is now seeing a repatriation trend back to on-premises.
Why is repatriation occurring in this space?
This is a trend we are seeing in larger enterprises due to rising ingest rates and the runaway growth of logs. It is more important than ever to discover attacks as soon as possible. The rising costs of downtime and of the work required to recover make every attack more costly than the last. To discover anomalies quickly, logs must be interrogated as fast as possible. To keep up, vendors have beefed up the compute behind their cloud offerings. Next-gen SIEM is moving from the classic, static-rules model to AI-driven, adaptive rules that evolve on the fly in order to detect issues as quickly as possible. To deliver that outcome, you need a storage platform that delivers the fastest reads possible. As stated, vendors attempt to do this in their cloud offerings by raising compute performance. But what we see enterprises dealing with is the rising cost of these solutions in the cloud.
How is this affecting these customers?
As organizations ingest more log and telemetry data (driven by cloud adoption, endpoint proliferation, and compliance), costs soar due to vendors' reliance on ingest-based and workload-based pricing. More data means larger daily ingestion, rapidly pushing customers into higher pricing tiers and resulting in substantial cost increases if volumes are not carefully managed. Increasing needs for real-time anomaly detection translate to greater compute demands and more frequent queries, which, for workload-based models, triggers faster consumption of compute credits and higher overall bills. To control costs, many organizations limit which data sources they ingest or perform data tiering, risking reduced visibility and slower detection for some threats.
How does an on-premises solution relieve some of these issues?
An on-premises solution, such as Pure Storage FlashBlade, offers the power of all-flash performance and fast reads to provide faster detection of anomalies and support the dynamic aspects of next-gen SIEM tools, while also offering more control over storage growth and associated costs, without sacrificing needed outcomes.
For example, our partnership with Splunk allows customers to retain more logs for richer analysis, run more concurrent queries in less time, and test new analyses and innovate faster.
[Visual 1: Snazzy, high-level look at Fast Analytics with our technology alliance partners]
Customers at our annual user extravaganza, Accelerate, told us about their process of bringing their logs back on-prem in order to address some of these issues. One customer in particular, Fiserv, told their story in our Cyber Resilience breakout session, where we spoke on what to do before, during, and after an attack, specifically in the area of visibility, where the race is on to identify threats faster. They told of their desire to rein in the cost of growth and regain control of their environment. There is nothing wrong with cloud solutions, but the economics of scaling those solutions have had real-world consequences, and bringing those workloads back on-prem, to a proven, predictable platform for performance, is beginning to be the better long-term strategy in the ongoing fight for cybersecurity and resilience.
On-premises storage is a valuable tool for managing the financial impact of growing data ingestion and analytics needs: it supports precise data management, retention policy enforcement, and infrastructure sizing, while reducing expensive cloud subscription fees for long-term, large-scale operations.
Exit Question: Are you seeing these issues developing in your log strategies? Are you considering on-premises for your log workloads today?
Jason Walker is a technical strategy director for cyber-related areas at Pure Storage and a real, live Florida Man. No animals or humans, nor the author himself, were injured in the creation of this post.
Configuring Apache Spark on FlashBlade, Part 3: Tuning for True Parallelism
This post will explore how to diagnose and resolve performance bottlenecks that are not related to storage I/O, ensuring you can take full advantage of the high-performance, disaggregated architecture of FlashBlade. We'll use a real-world scenario to illustrate how specific tuning can unlock massive parallelism.
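The teaser above summarizes the full post; purely as a general illustration of the kind of non-storage tuning it refers to, here is a hedged PySpark sketch of the parallelism knobs most often adjusted when a job is bottlenecked on too few tasks rather than on I/O. The values and the bucket path are placeholders to size against your own cluster, not settings taken from the post.

```python
# Illustrative only: common Spark parallelism settings adjusted when tasks,
# not storage, are the bottleneck. Values are placeholders, not recommendations.
from pyspark.sql import SparkSession

spark = (
    SparkSession.builder
    .appName("parallelism-tuning-sketch")
    # Shuffle stages produce this many partitions; too few leaves cores idle.
    .config("spark.sql.shuffle.partitions", "512")
    # Default partition count for RDD operations without an explicit setting.
    .config("spark.default.parallelism", "512")
    # Cap input split size so large files on object storage fan out into
    # enough read tasks to keep every executor busy.
    .config("spark.sql.files.maxPartitionBytes", "134217728")  # 128 MiB
    .getOrCreate()
)

df = spark.read.parquet("s3a://example-bucket/events/")   # hypothetical path
print("input partitions:", df.rdd.getNumPartitions())     # quick parallelism check
```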