Pure Storage FlashBlade is leading a new category of storage. It’s a bold statement and one that Pure Storage believed was true from the day they launched FlashBlade. Three years in and well over $500M later, more customers are substantiating their belief every day.
FlashBlade is so unlike any prior category of storage that any such comparison is a classic “apples to oranges.” If you know anything about Pure Storage, you know that they design to break the mold. They’re not satisfied just to incrementally improve what currently exists. This holds true for FlashBlade, particularly in the context of the modern data era.
Categories, as we know them, have particular shared characteristics (by definition).
Let’s dig a little deeper into the particular characteristics of FlashBlade that are specifically relevant to “modern data” and don’t align with any prior categories of storage.
But First… What Is “Modern” Data?
Modern data is anything but basic. Modern data is always mission-critical (and often under attack). It’s born-digital, multi-dimensional, multimodal, geo-distributed, and never, ever, at rest. Oh, and it’s big. Really big.
- Born digital: Machine-generated data doesn’t all fit neatly in one database. It’s not even all generated in-house. It is multi-dimensional and can be very unpredictable.
- Multimodal: Files offer flexibility and compatibility for existing workflows. Objects offer compatibility and interoperability with cloud-native applications. Real-world applications are increasingly blending these in consolidated pipelines. By some estimates, enterprises will triple their unstructured file and object data requirements within the next three years.
- Always flowing: Billions of files and objects are generated, processed, and analyzed in real time at scale.
- Geo-distributed: Modern data needs to be replicated for protection and/or distribution, sometimes from the cloud and sometimes to the cloud.
Perhaps the most important thing to note when managing modern data is that details matter. A platform optimized for modern data needs to deliver the same things we’ve talked about for years: performance, simplicity, and the ability to consolidate.
That’s performance and simplicity and the ability to consolidate.
When you’re forced to pick between performance or simplicity or the ability to consolidate, things go awry.
With modern data, the architecture and feature details really matter. They determine whether a platform actually can deliver all three things in a way that uniquely addresses modern data challenges.
Modern Data Demands New Performance Paradigms
Fast file storage is not new. However, most traditional architectures can deliver high performance for either small or large files, and for either sequential or random workloads, but not for all of them at once. Modern data requires all of the above at the same time, for three reasons. First, machines can generate many different kinds of data, and you often must both capture and analyze that data in real time. Second, architecting for the future means building for the unknown, and the workload you have today may change tomorrow. Third, true consolidation means not being limited in what types of applications can be combined to share infrastructure.
In contrast, FlashBlade delivers multi-dimensional file and object performance via a highly parallelized architecture. This is a key category-creation-level differentiator, and it’s fundamental to FlashBlade’s ability to take consolidation to the next level.
So why fast object? It’s an understandable question. In fact, one analyst commented to the Pure Storage team that “just a few years ago fast object storage was basically an oxymoron.” And he was right. Object storage was initially introduced as a simple way to store large amounts of archive and less mission-critical data. But Pure Storage couldn’t ignore the fact that cloud-native apps were using object as their default storage (that is, their persistence layer) and that application design was shifting to align with these cloud-native concepts, often based on Amazon S3.
Pure Storage saw a future in which some applications would require higher performance than can be delivered in the public cloud. One in which many organizations would need the ability to run fast object on-premises or in a hybrid cloud architecture. A future where multicloud, including the cloud you own on-prem, would need a fast object storage fabric. With this logic in mind, Pure Storage intentionally designed the underlying architecture of FlashBlade to skate to where the puck was going.
(Not Your Grandfather’s) Consolidation
Since the first SANs were introduced decades ago, storage providers (including Pure) have demonstrated the core benefits of moving away from siloed storage and consolidating. Consolidation translates to less stranded capacity, greater environmental efficiency, and fewer things to manage. It is also the key to enabling multiple applications to leverage the same data instead of duplicating it across silos. That’s why it’s been interesting to see, over the past decade, the resurgence of direct-attached storage (DAS) environments in which all resources (compute + storage, in particular) are required to grow together, regardless of need. It’s like adding an engine to a train every time you add a boxcar: Not only is it an inefficient and poor use of resources, but it’s costly and complex.
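To put rough numbers on the train analogy, here is a minimal sketch of what happens when compute and storage must grow together. The function name, the node sizes, and the workload figures are all hypothetical and chosen only for illustration; they do not describe any real product.

```python
import math

def stranded_resources(compute_needed, capacity_tb_needed,
                       compute_per_node, tb_per_node):
    """Estimate over-provisioning when every node bundles compute and storage.

    All inputs are hypothetical planning numbers, not vendor specifications.
    """
    # In a coupled (DAS-style) design, the node count is driven by whichever
    # resource runs out first; the other resource is bought along with it.
    nodes_for_compute = math.ceil(compute_needed / compute_per_node)
    nodes_for_capacity = math.ceil(capacity_tb_needed / tb_per_node)
    nodes = max(nodes_for_compute, nodes_for_capacity)
    return {
        "nodes": nodes,
        "idle_compute": nodes * compute_per_node - compute_needed,
        "idle_tb": nodes * tb_per_node - capacity_tb_needed,
    }

# A capacity-heavy workload: 40 compute units and 1,000 TB,
# on nodes that each supply 8 compute units and 50 TB.
print(stranded_resources(40, 1000, 8, 50))
# {'nodes': 20, 'idle_compute': 120, 'idle_tb': 0}
```

In this made-up case, capacity alone forces 20 nodes even though 5 would cover the compute need, so 120 compute units sit idle. That is the "engine for every boxcar" effect in miniature.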
While there are significant advantages to consolidation, performance and scale are still table stakes. And until FlashBlade, there wasn’t a single platform that could deliver the necessary multi-dimensional performance at scale to enable applications to move away from DAS or other siloed architectures and realize the benefits of consolidation.
Now let’s talk about scale.
In the context of consolidation, “scale” spans several vectors:
- First, the most common way to think about data scale is in GBs, TBs, and PBs. But that’s really just one aspect of scale. Similarly critical is the number of files or objects in a data set, which needs to be able to reach tens or hundreds of billions in modern data applications.
- Second is performance—the ability to start with high levels of performance, but also predictably grow that performance as needed.
- Third is the ability to unlock that performance in multiple dimensions. In order to consolidate workloads with different IO needs and patterns, a solution must provide high performance simultaneously in multiple dimensions.
- Last, and perhaps most important, is how you scale. The requirements of modern workloads are rarely known upfront or static, so the ability to scale non-disruptively and on-demand is critical in consolidated environments.
Simplicity: Modern Data Knows that Less Is Always More
Simplicity is one of the key reasons that organizations have gravitated toward public-cloud offerings, and understandably so. “Complicated Fast File and Object” not only doesn’t fit the bill; it defeats the purpose. And it fails to solve the cost, inefficiency, and complexity challenges of either disparate silos or hard-to-manage DAS spread out everywhere.
But simplicity, like performance, is multi-dimensional and requires a lot of up-front design work to get right. Sometimes, a big part of that simplicity isn’t even in the storage layer itself. Often, the bigger challenge is the complexity of getting the networking right. Pure took this challenge head-on: they didn’t stop at the water’s edge and leave networking as an exercise for the reader. Instead, they integrated networking into FlashBlade.
If it were simple to build a scale-out system that is accessible as a single IP address and dynamically load-balanced across all blades, we wouldn’t see it take 10x longer to set up other environments compared to FlashBlade. Competitive platforms require extensive tuning of esoteric settings (node counts, disk aggregates, pools, protection schemes, background job timing, and so on) to deliver performance. But FlashBlade handles some of the world’s most demanding workloads, including massive AI and chip-design environments, without manual tuning. Setting up replication on FlashBlade takes just two steps, versus a stack of manuals and a Ph.D. in storage, networking, and running science experiments. Now, more than ever, this really matters to modern data.
Unified Fast File and Object: A New Category of Storage
Every time a new customer comes on board, two trends stand out in the details they share about why they chose FlashBlade:
- Customers recognize that they have new modern data challenges and they want to ensure that their investment can address their needs today and tomorrow. They deploy FlashBlade for an initial application (ransomware recovery, for example) with the explicit intent to leverage it to consolidate other emerging use cases such as log analytics or AI. And as more organizations are discovering, architecting for tomorrow means architecting for the unknown, which makes multidimensional flexibility critical.
- They use the word “only,” A LOT. As in “only FlashBlade could reduce tape-out times by 4x.” Or “only FlashBlade was up and running in hours—not days.” And “only FlashBlade can enable this organization to meet the needs of its Elastic environment and recover from a ransomware attack inside of its SLA.”