Ask Us Everything: Evergreen//One™ Edition — What the Community Learned
A recent Ask Us Everything (AUE) session on Pure Storage Evergreen//One™ was a lively, deeply technical conversation—and exactly the kind of dialogue that makes the Pure Community special. Here are some of the biggest takeaways, organized around the questions asked and the insights that followed.

We are just one week away from PUG #3
On January 28th, the Cincinnati Pure User Group will convene at Ace's Pickleball to discuss Enterprise File. We will be joined by Matt Niederhelman, Unstructured Data Field Solutions Architect, who will help guide the conversation and answer questions about what he is seeing among other customers. Click the link below to register and come join us, and help us guide the conversation with your ideas for future topics.
https://info.purestorage.com/2025-Q4AMS-COMREPLTFSCincinnatiPUG-LP_01---Registration-Page.html

Cincinnati Pure User Group: Real-time Enterprise File
Register Now => Join us for an exclusive Pure User Group (PUG) session dedicated to the future of file services. This isn't just a technical briefing; it’s a community gathering designed for peer-to-peer learning and strategic roadmap building. We’re diving deep into the Real-time Enterprise File vision—exploring how to unify your environment across FlashArray and FlashBlade to eliminate silos and escape the "forklift upgrade" trap forever. Whether you’re managing simple departmental shares or complex AI/ML pipelines, this is your chance to connect with local experts, share battle-tested insights, and see how to make your data plane as agile as your business demands.

What You’ll Learn
- The Power of Choice: Understand how Pure’s file capabilities span the entire portfolio. We’ll clarify exactly when to leverage FlashArray vs. FlashBlade for workloads ranging from VDI and VMware over NFS to massive AI/ML repositories.
- Production-Ready Excellence: Go beyond the basics with a look at the capabilities that matter in the real world: multi-protocol support (SMB/NFS), directory integration, Kerberos security, and multi-tenancy for segmented environments.
- The "Last Refresh" Strategy: Get practical, no-nonsense guidance on sizing and migration tooling. Learn how to consolidate legacy filers and execute a migration that ensures you never have to do a forklift upgrade again.
- Peer-to-Peer Wisdom: This is a user group first. You’ll hear directly from local customers about their real-world journeys—what worked, what didn't, and the lessons they learned that you can apply to your own data center tomorrow.

Event Agenda
- 2:00 PM | Welcome & Round-the-Room: We start with quick intros. We want to know who you are and exactly what technical hurdles you’re looking to clear.
- 2:15 PM | The Real-time Enterprise File Vision: An overview of the vision and where the portfolio is headed. See what’s new and what’s next for FlashArray and FlashBlade.
- 2:40 PM | Deep Dive: Design Patterns & Use Cases: We’ll walk through common architectural designs for home directories, content repositories, and NFS datastores, including proven protection and recovery patterns.
- 3:10 PM | Customer Spotlight & Panel: A 25-minute interactive session with local peers. Hear their architecture stories and get your toughest questions answered in an open Q&A.
- 3:35 PM | Whiteboard Session: Your File Roadmap: An open, interactive conversation about your specific challenges—from unstructured data growth to migration blockers. Let’s map out where Pure can help.
- 3:55 PM | Wrap-up & Next Steps: Key takeaways, resources for your team, and a preview of our next PUG event.
- 4:00 PM | Networking & Happy Hour

Date & Time
January 28, 2026, 2:00 PM - 4:00 PM EST

Location
Aces Pickleball, 2730 Maverick Dr, Norwood, OH 45212 (Factory 52)

Stop Prompting, Start Context Engineering
This blog post argues that Context Engineering is the critical new discipline for building autonomous, goal-driven AI agents. Since Large Language Models (LLMs) are stateless and forget information outside their immediate context window, Context Engineering focuses on assembling and managing the necessary information—such as session history, long-term memory (embeddings, RAG indexes), and tool outputs—for the agent on every single turn. The post asserts that storage, not the LLM or the prompt, is the primary performance bottleneck for AI at scale: because the agent must repeatedly retrieve and persist context data, the speed of the underlying storage architecture dictates its responsiveness.
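As a rough illustration of that per-turn assembly step (a minimal sketch only; the session, memory, and tool objects below are hypothetical stand-ins, not anything from the post itself), the context build for a single turn might look like this:

def build_context(user_msg, session, memory, tools, budget_tokens=8000):
    """Assemble what a stateless LLM needs for one turn, within a token budget."""
    # Pull long-term memory relevant to the new message (e.g., a RAG/embedding index).
    recalled = memory.search(user_msg, top_k=5)
    # Collect outputs from tools the agent invoked on earlier turns.
    tool_results = [t.last_output() for t in tools if t.has_output()]
    # Most important material first: the new message, recent history, then recall.
    candidates = [user_msg] + session.recent_messages() + recalled + tool_results
    assembled, used = [], 0
    for piece in candidates:
        cost = len(piece) // 4  # crude token estimate (~4 characters per token)
        if used + cost > budget_tokens:
            break
        assembled.append(piece)
        used += cost
    return "\n\n".join(assembled)

Every one of those retrievals hits the storage layer, which is the post's point: the faster the context can be fetched and persisted, the faster the agent responds.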
Announcing the General Availability of Purity//FB 4.6.6
We are happy to announce the general availability of 4.6.6, the seventh release in the 4.6 Feature Release line. See the release notes for all the details about the many features, bug fixes, and security updates included in the 4.6 release line.

UPGRADE RECOMMENDATIONS AND EOL SCHEDULE
Customers who are running any previous 4.6 version should upgrade to 4.6.6. Customers who are looking for long-term maintenance of a consistent feature set are recommended to upgrade to the 4.5 LLR. Check out our AI Copilot intelligent assistant for deeper insights into release content and recommendations. Development on the 4.6 release line will continue through February 2026. After this time the full 4.6 feature set will roll into the 4.7 Long Life Release line for long-term maintenance, and the 4.6 line will be declared End-of-Life (EOL).

HARDWARE SUPPORT
This release is supported on the following FlashBlade platforms: FB//S100, FB//S200 (R1, R2), FB//S500 (R1, R2), FB//ZMT, FB//E, FB//EXA

LINKS AND REFERENCES
- Purity//FB 4.6 Release Notes
- Purity//FB Release and End-of-Life Schedule
- Purity//FB Release Guidelines
- FlashBlade Hardware and End-of-Support
- FlashBlade Capacity and Feature Limits
- Pure1 Manage AI Copilot

Cincinnati PUG 3
Well, it's the start of the year and time to start planning our next Cincinnati Pure User Group. In our last meeting we discussed Cyber Resiliency with Shawn Snider, Chief Information Security Officer at SEHP. This session we look to tackle topics around Enterprise File. We have targeted January 28th, again at Ace's Pickleball. Nick Fritsch or I will post the verified details soon. Looking forward to more great discussion and collaboration!

Pure Fusion Expert Demo: From Fleet Creation to Policy‑Driven Provisioning
December 16 | Register Now!
Manual provisioning and reactive management can slow innovation and drain valuable IT time. What if you could manage your enterprise data intelligently? Join us for a Pure Fusion™ Expert-led Demos webinar:
- Walk through Pure Fusion configuration and fleet creation to securely federate arrays and gain one consistent data management experience across your environment.
- See remote provisioning in action—manage any array from any array and provision storage anywhere via GUI, CLI, or API.
- Learn how policy‑driven presets standardize protection, QoS, and naming for repeatable, error‑free deployments—and get AI‑driven placement recommendations.
Register Now!

How to Use Logstash to Send Directly to an S3 Object Store
This article originally appeared on Medium.com and has been republished with permission from the author.

To aggregate logs directly to an object store like FlashBlade, you can use the Logstash S3 output plugin. Logstash aggregates and periodically writes objects on S3, which are then available for later analysis. This plugin is simple to deploy and does not require additional infrastructure and complexity, such as a Kafka message queue.

A common use case is an existing Logstash system that filters log lines so that only a small percentage are sent to an Elasticsearch cluster. A second output filter to S3 would keep all log lines in raw (un-indexed) form for ad-hoc analysis and machine learning. The diagram below illustrates this architecture, which balances expensive indexing against raw data storage.

Logstash Configuration

An example Logstash config highlights the parts necessary to connect to FlashBlade S3 and send logs to the bucket "logstash," which should already exist. The input section is a trivial example and should be replaced by your specific input sources (e.g., filebeats).

input {
  file {
    path => ["/home/logstash/testdata.log"]
    sincedb_path => "/dev/null"
    start_position => "beginning"
  }
}
filter {
}
output {
  stdout { codec => rubydebug }
  s3 {
    access_key_id => "XXXXXXXX"
    secret_access_key => "YYYYYYYYYYYYYY"
    endpoint => "https://10.62.64.200"
    bucket => "logstash"
    additional_settings => { "force_path_style" => true }
    time_file => 5
    codec => "plain"
  }
}

Note that the force_path_style setting is required; a FlashBlade endpoint needs path-style addressing instead of virtual-host addressing. Path-style addressing does not require co-configuration with DNS servers and is therefore simpler in on-premises environments.

For a more secure option, instead of specifying the access/secret keys in the pipeline configuration file, they can be specified as the environment variables AWS_ACCESS_KEY and AWS_SECRET_ACCESS_KEY.

Logstash can trade off the efficiency of writing to S3 against the possibility of data loss through the two configuration options "time_file" and "size_file," which control how frequently buffered lines are flushed to an object. Larger flushes result in more efficient writes and object sizes, but also a larger window of possible data loss if a node fails. The maximum amount of data loss is the smaller of "size_file" and "time_file" worth of data.

Validation Test

To test the flow of data through Logstash to FlashBlade S3, I use the public Docker image for Logstash. Starting with the configuration file shown above, customize the fields for your specific FlashBlade environment and place the file in the ${PWD}/pipeline/ directory. We then volume-mount the configuration into the Logstash container at runtime.

Start a Logstash server as a Docker container as follows:

> docker run --rm -it -v ${PWD}/pipeline/:/usr/share/logstash/pipeline/ -v ${PWD}/logs/:/home/logstash/ docker.elastic.co/logstash/logstash:7.6.0

Note that I also volume-mounted the ${PWD}/logs/ directory, which is where Logstash will look for incoming data. In a second terminal, I generate synthetic data with the flog tool, writing into the shared "logs/" directory:

> docker run -it --rm mingrammer/flog > logs/testdata.log

Logstash will automatically pick up this new log data and start writing to S3. Then look at the output on S3 with s5cmd; in my example the result is three objects written (5MB, 5MB, and 17KB in size).
> s5cmd ls s3://logstash/
2020/02/28 04:09:42 17740 ls.s3.03210fdc-c108-4e7d-8e49-72b614366eab.2020-02-28T04.04.part28.txt
2020/02/28 04:10:21 5248159 ls.s3.5fe6d31b-8f61-428d-b822-43254d0baf57.2020-02-28T04.10.part30.txt
2020/02/28 04:10:21 5256712 ls.s3.9a7f33e2-fba5-464f-8373-29e9823f5b3a.2020-02-28T04.09.part29.txt

Making Use of Logs Data with Spark

In PySpark, the log lines can be loaded for a specific date as follows:

logs = sc.textFile("s3a://logstash/ls.s3.*.2020-02-29*.txt")

Because the ordering of the key places the uid before the date, each time a new Spark dataset is created it will require enumerating all objects. This is an unfortunate consequence of not having the key prefixes in the right order for sorting by date. Once loaded, you can perform custom parsing and analysis, use the Spark-Elasticsearch plugin to index the full set of data, or start machine learning experiments with SparkML.
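As one hedged example of such custom parsing (flog emits Apache-style access log lines by default; the field position assumed below is illustrative and may need adjusting for your log format):

from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("logstash-s3-logs").getOrCreate()
logs = spark.sparkContext.textFile("s3a://logstash/ls.s3.*.2020-02-29*.txt")

# Count requests per HTTP status code, assuming Apache-style access log lines
# where the status code is the second-to-last whitespace-separated field.
status_counts = (
    logs.map(lambda line: line.split())
        .filter(lambda fields: len(fields) >= 2)
        .map(lambda fields: (fields[-2], 1))
        .reduceByKey(lambda a, b: a + b)
)
print(status_counts.collect())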
OT: The Architecture of Interoperability

In a previous post, we explored the fundamental divide between Information Technology (IT) and Operational Technology (OT). We established that while IT manages data and applications, OT controls the physical heartbeat of our world, from factory floors to water treatment plants. In this post we are diving deeper into the bridge that connects them: interoperability.

As Industry 4.0 and the Internet of Things (IoT) accelerate, the "air gap" that once separated these domains is evolving. For modern enterprises, the goal isn't just to have IT and OT coexist, but to have them communicate seamlessly. Whether the use cases are security, real-time quality control, or predictive maintenance, to name a few, interoperability becomes the critical engine for operational excellence.

The Interoperability Architecture

Interoperability is more than just connecting cables; it’s about creating a unified architecture where data flows securely between the shop floor and the “top floor”. In legacy environments, OT systems (like SCADA and PLCs) often run on isolated, proprietary networks that don’t speak the same language as IT’s cloud-based analytics platforms. To bridge this, a robust interoperability architecture is required. This architecture must support:

- Industrial Data Lake: A single storage platform that can handle block, file, and object data is essential for bridging the gap between IT and OT. This unified approach prevents data silos by allowing proprietary OT sensor data to coexist on the same high-performance storage as IT applications (such as ERP and CRM). The benefit is the creation of a high-performance industrial data lake, where OT and IT data from various sources can be streamed directly, minimizing the need for data movement, a critical efficiency gain.

- Real-Time Analytics: OT sensors continuously monitor machine conditions, including vibration, temperature, and other critical parameters, generating real-time telemetry data. An interoperable architecture built on high-performance flash storage enables instant processing of this data stream. By integrating IT analytics platforms with predictive algorithms, the system identifies anomalies before they escalate, accelerating maintenance response, optimizing operations, and streamlining exception handling. This approach reduces downtime, lowers maintenance costs, and extends overall asset life. (A simplified sketch of this pattern follows at the end of this section.)

- Standards-Based Design: As outlined in recent cybersecurity research, modern OT environments require datasets that correlate physical process data with network traffic logs to detect anomalies effectively. An interoperable architecture facilitates this by centralizing data for analysis without compromising the security posture. IT/OT convergence also requires a platform capable of securely managing OT data, often through IT standards. An API-first design allows the entire platform to be built on robust APIs, enabling IT to easily integrate storage provisioning, monitoring, and data protection into standard, policy-driven IT automation tools (e.g., Kubernetes, orchestration software).

Pure Storage addresses these interoperability requirements with the Purity operating environment, which abstracts the complexity of the underlying hardware and provides a seamless, multiprotocol experience (NFS, SMB, S3, FC, iSCSI). This ensures that whether data originates from a robotic arm or a CRM application, it is stored, protected, and accessible through a single, unified data plane.
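To make the real-time analytics pattern above concrete, here is a deliberately simplified sketch. The thresholds, field names, and asset IDs are invented for illustration; a real deployment would read telemetry from the plant historian or a streaming pipeline rather than a hard-coded sample.

import json

# Hypothetical alarm limits for one asset class (illustrative values only).
LIMITS = {"vibration_mm_s": 7.1, "temperature_c": 85.0}

def check_telemetry(reading):
    """Return alert strings for any metric in a sensor reading that exceeds its limit."""
    alerts = []
    for metric, limit in LIMITS.items():
        value = reading.get(metric)
        if value is not None and value > limit:
            alerts.append(f"{reading['asset_id']}: {metric}={value} exceeds limit {limit}")
    return alerts

# Example reading as it might arrive from an OT gateway into the industrial data lake.
sample = json.loads('{"asset_id": "pump-07", "vibration_mm_s": 9.3, "temperature_c": 71.2}')
for alert in check_telemetry(sample):
    print(alert)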
Real-World Application: A Large Regional Water District

Consider a large regional water district, a major provider serving millions of residents. In an environment like this, maintaining water quality and service reliability is a 24/7 mission-critical OT function. Its infrastructure relies on complex SCADA systems to monitor variables like flow rates, tank levels, and chemical compositions across hundreds of miles of pipelines and treatment facilities.

By adopting an interoperable architecture, an organization like this can break down the silos between its operational data and its IT capabilities. Instead of SCADA data remaining locked in a control room, it can be securely replicated to IT environments for long-term trending and capacity planning. For instance, historical flow data combined with predictive analytics can help forecast demand spikes or identify aging infrastructure before a leak occurs. This convergence transforms raw operational data into actionable business intelligence, ensuring reliability for the communities served.

Why We Champion Compliance and Governance

Opening up OT systems to IT networks can introduce new risks. In the world of OT, "move fast and break things" is not an option; reliability and safety are paramount. This is why Pure Storage wraps interoperability in a framework of compliance and governance, including:

- FIPS 140-2 Certification & Common Criteria: We utilize FIPS 140-2 certified encryption modules and have achieved Common Criteria certification.
- Data Sovereignty: Our architecture includes built-in governance features like Always-On Encryption and rapid data locking to ensure compliance with domestic and international regulations, protecting sensitive data regardless of where it resides.
- Compliance: Pure Fusion delivers policy-defined storage provisioning, automating deployment with specified requirements for tags, protection, and replication.

By embedding these standards directly into the storage array, Pure Storage allows organizations to innovate with interoperability while maintaining the security posture that critical OT infrastructure demands.

Next in the series: We will explore IT/OT interoperability further, along with processing of data at the edge. Stay tuned!

How to Improve Python S3 Client Performance with Rust
This article originally appeared on PureStorage.com. It has been republished with permission from the author.

Python is the de facto language for data science because of its ease of use and performance. But that performance comes only because libraries like NumPy offload computation-heavy functions, like matrix multiplication, to optimized C code. Data science tooling and workflows continue to improve, data sets get larger, and GPUs get faster. So as object storage systems, like S3, become the standard for large data sets, the retrieval of data from object stores has become a bottleneck. Slow S3 access results in idle compute, wasting expensive CPU and GPU resources.

Almost all Python-based use of data in S3 leverages the Boto3 library, an SDK that enables flexibility but comes with the performance limitations of Python. Native Python execution is relatively slow and especially poor at leveraging multiple cores due to the Global Interpreter Lock (GIL). There are other projects, such as a plugin for PyTorch or leveraging Apache Arrow via PyArrow bindings, that aim to improve S3 performance for a specific Python application. I have also previously written about issues with S3 performance in Python: CLI tool speeds, object listing, Pandas data loading, and metadata requests.

This blog post points in a promising direction for solving the Python S3 performance problem: replacing Boto3 with equivalent functionality written in a modern, compiled language. My simple Rust reimplementation, FastS3, results in 2x-3x performance gains versus Boto3 for both large object retrieval and object listings. Surprisingly, this result is consistent for both fast, all-flash object stores like FlashBlade®, as well as traditional object stores like AWS’s S3.

Experimental Results

Python applications access object storage data primarily through either 1) object store specific SDKs like Boto3 or 2) filesystem-compatible wrappers like s3fs and fsspec. Both Boto3 and s3fs will be compared against my minimal Rust-based FastS3 code to both 1) retrieve objects and 2) list keys.

S3fs is a commonly used Python wrapper around the Boto3 library that provides a more filesystem-like interface for accessing objects on S3. Developers benefit because file-based Python code can be adapted for objects with minimal or no rewrites. Fsspec provides an even more general interface that offers a similar filesystem-like API for many different types of backend storage. My FastS3 library should be viewed as a first step toward an fsspec-compliant replacement for the Python-based s3fs.

In Boto3, there are two ways to retrieve an object: get_object and download_fileobj. Get_object is easier to work with but slower for large objects, while download_fileobj is a managed transfer service that uses parallel range GETs if an object is larger than a configured threshold. My FastS3 library mirrors this logic, reimplemented in Rust. S3fs enables reading from objects using a pattern similar to standard Python file opens and reads.

The tests focus on two common performance pain points: retrieving large objects and listing keys. There are other workloads that are not yet implemented or optimized, e.g., small objects and uploads. All tests are run on a virtual machine with 16 cores and 64GB DRAM, against either a small FlashBlade system or AWS S3.

Result 1: GET Large Objects

The first experiment measures retrieval (GET) time for large objects using FastS3, s3fs, and both Boto3 codepaths.
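For reference, here is a minimal sketch of the two Boto3 codepaths under test; the endpoint, bucket, key, and TransferConfig values are illustrative placeholders, not the benchmark's actual settings.

import io
import boto3
from boto3.s3.transfer import TransferConfig

# Endpoint, bucket, and key are placeholders for your environment.
s3 = boto3.client("s3", endpoint_url="https://10.62.64.200")

# Codepath 1: get_object issues a single GET and streams the body back.
data = s3.get_object(Bucket="datasets", Key="large.bin")["Body"].read()

# Codepath 2: download_fileobj is a managed transfer that switches to
# parallel range GETs once the object exceeds multipart_threshold.
buf = io.BytesIO()
cfg = TransferConfig(multipart_threshold=64 * 1024 * 1024, max_concurrency=16)
s3.download_fileobj(Bucket="datasets", Key="large.bin", Fileobj=buf, Config=cfg)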
The goal is to retrieve an object from FlashBlade S3 into Python memory as fast as possible. All four functions scale linearly as the object size increases, with the Rust-based FastS3 being 3x and 2x faster than s3fs-read/boto3-get and boto3-download respectively. The relative speedup of FastS3 is consistent from object sizes of 128MB up to 4GB.

Result 2: GETs on FlashBlade vs. AWS

The previous results focused on retrieval performance against a high-performance, all-flash FlashBlade system. I also repeated the experiments using a traditional object store with AWS’s S3 and found similar performance gains. The graph below shows the relative performance of FastS3 and Boto3 download(), with values less than 1.0 indicating Boto3 is faster than FastS3.

For objects larger than 1GB-2GB, the Rust-based FastS3 backend is consistently 2x faster at retrieving data than Boto3’s download_fileobj function, against both FlashBlade and AWS. Recall that download_fileobj is significantly faster with large objects than the basic Boto3 get_object function. As a result, FastS3 is at least 3x faster than Boto3’s get_object. The graph compares FastS3 against download_fileobj because it is Boto3’s fastest option, though it is also the least convenient to use.

For objects smaller than 128MB-256MB, the FastS3 calls are slower than Boto3, indicating that there are still missing optimizations in my FastS3 code. FastS3 currently uses 128MB as the download chunk size to control parallelism, which works best for large objects but clearly is not ideal for smaller objects.

Result 3: Listing Objects

Metadata listing is commonly a slow S3 operation. The next test compares the Rust-based implementation of ls(), i.e., listing keys that match a given prefix and delimiter, with Boto3’s list_objects_v2() and s3fs’s ls() operation. The objective is to enumerate 400k objects with a given prefix.

Surprisingly, FastS3 is significantly faster than Boto3 at listing objects, despite FastS3 not being able to leverage concurrency. The FastS3 listing is 4.5x faster than Boto3 against FlashBlade and 2.7x faster against AWS S3. The s3fs implementation of ls() also introduces a slight overhead of 4%-8% when compared to directly using boto3 list_objects_v2.

Code Walkthrough

All the code for FastS3 can be found on GitHub, including the Rust implementation and a Python benchmark program. I leverage the PyO3 library to create the bindings between my Rust functions and Python. I also use the official AWS SDK for Rust, which at the time of this writing is still in tech preview at version 0.9.0. The Rust code issues concurrent requests to S3 using the Tokio runtime.

Build the Rust FastS3 library using maturin, which packages the Rust code and PyO3 bindings into a Python wheel:

maturin build --release

The resulting wheel can be installed as with any Python wheel:

python3 -m pip install fasts3/target/wheels/*.whl

Initialization logic for Boto3 and FastS3 is similarly straightforward, using only an endpoint_url to specify the FlashBlade data VIP or an empty string for AWS. The access key credentials are found automatically by the SDK, e.g., as environment variables or a credentials file.

import boto3
import fasts3

s3r = boto3.resource('s3', endpoint_url=ENDPOINT_URL)  # boto3
s = fasts3.FastS3FileSystem(endpoint=ENDPOINT_URL)     # fasts3 (rust)

And then FastS3 is even simpler to use in some cases.
# boto3 download_fileobj()
bytes_buffer = io.BytesIO()
s3r.meta.client.download_fileobj(Bucket=BUCKET, Key=SMALL_OBJECT, Fileobj=bytes_buffer)

# fasts3 get_objects
contents = s.get_objects([BUCKETPATH])

FastS3 requires the object path to be specified as “bucketname/key,” which maps to the s3fs and fsspec API and treats the object store as a more generic file-like backend.

The Rust code for the library can be found in a single file. I am new to Rust, so this code is not “well-written” or idiomatic Rust, just demonstrative. To understand the flow of the Rust code, there are three functions that serve as interconnects between Python and Rust: new(), ls(), and get_objects().

pub fn new(endpoint: String) -> FastS3FileSystem

This is a simple factory function for creating a FastS3 object, with the endpoint argument pointing to the object store endpoint.

pub fn ls(&self, path: &str) -> PyResult<Vec<String>>

The ls() function returns a Python list[] of keys found in the given path. The implementation is a straightforward use of a paginated list_objects_v2. There is no concurrency in this implementation; each page of 1,000 keys is returned serially. Therefore, any performance advantage of this implementation is strictly due to Rust performance gains over Python.

pub fn get_objects(&self, py: Python, paths: Vec<String>) -> PyResult<PyObject>

The get_objects function takes a list of paths and concurrently downloads all objects, returning a list of Bytes objects in Python. Internally, the function first issues a HEAD request to all objects in order to get their sizes and then allocates the Python memory for each object. Finally, the function concurrently starts retrieving all objects, splitting large objects into chunks of 128MB. A key implementation detail is to first allocate the memory for the objects in Python space using a PyByteArray and then copy the downloaded data into that memory using Rust, which avoids a memory copy to move the object data between Rust- and Python-managed memory. As a side note, dividing a memory buffer into chunks so that data can be written in parallel really forced me to better understand Rust’s borrow checker!

What About Small Objects?

Notably lacking in the results presented are small-object retrieval times. The FastS3 library as I have written it is not faster (and sometimes slower) than Boto3 for small objects. But I am happy to speculate that this has nothing to do with the language choice and is largely because my code is so far only optimized for large objects. Specifically, my code issues a HEAD request to retrieve the object size before starting the downloads in parallel, whereas with a small object it is more efficient to just GET the whole object in a single remote call. Clearly, there is opportunity for optimization here.

Summary

Python’s prominence in data science and machine learning continues to grow. And the mismatch in performance between accessing object storage data and compute hardware (GPUs) continues to widen. Faster object storage client libraries are required to keep modern processors fed with data. This blog post has shown that one way to significantly improve performance is to replace native Python Boto3 code with compiled Rust code. Just as NumPy makes computation in Python efficient, a new library needs to make S3 access more efficient.
While my code example shows significant improvement over Boto3 in loading large objects and metadata listings, there is still room for improvement in small-object GET operations, and more of the API remains to be reimplemented. The goal of my Rust-based FastS3 library is to demonstrate the 2x-3x scale of improvements possible and to encourage more development on this problem.