Are You Ready for the Data Explosion?
Businesses and other enterprises have always been dependent on reliable information about their customers and on accurate tracking of business transactions. This dependency has only grown in the age of computing, and has reached the point where it is paramount in the operations of most businesses.
With the growing recognition of the value of all business data comes the evolving requirement to retain more of that data for longer time periods, adding to overall data volume. According to IBM, 90 per cent of the world’s data has been created in just the past two years, and each day adds another 2.5 quintillion bytes of new data. This means that overall, enterprise databases are steadily growing.
Although it doesn’t affect every organization, the parallel emergence of Big Data encourages this trend. Big Data – the practice of analyzing larger data sets than ever before to extract new insights about customer behavior, buying trends, and decision-making – is growing in potential value. Big Data promises an edge to companies that can gather and meaningfully interpret information about customer needs and actions to meet expectations and create new opportunities.
So what does this data growth mean for today’s organizations?
Dealing with the Data Deluge
Whether Big Data is part of your enterprise’s strategic plan or not, safeguarding transactional, database, and other information from accidental deletion is of critical interest. While computer systems such as the IBM i are more reliable than ever, no system is failure-proof. Malfunctions of servers, communication links, or internal hardware, as well as human error, local calamities, and other mishaps can disrupt computer operations or destroy data archives at any time for unpredictable periods. Recognition of this reality is the basis for responsible IT departments focusing on providing high-availability (HA) and disaster-recovery (DR) alternatives to which they can turn in the event of problems.
Briefly put, HA and DR plans call for having a backup system to which all transactions and data are passed from a production server. That way, if the production server develops a problem, processing can either be transferred to the backup system (HA), or at least all data can be retrieved from it when the production system is back online (DR).
The goal of both plans is to minimize disruption to normal business processing if a failure does occur. Even when disruptions never materialize, such systems prove their worth by protecting data and keeping it accessible.
Meeting the Need
Maxava, a leader in HA and DR solutions for IBM i since 2000, offers multiple products for meeting such contingencies, either on-premise or as hosted services. The Maxava HA suite, available in three versions (Data Stream, SMB, and Enterprise+), either as a licensed product or as a cloud-based service, can replicate data and objects in real-time to any number of IBM i systems located anywhere in the world. Maxava’s maxView Manager module additionally lets designated personnel remotely manage the Maxava HA environment via a browser or mobile devices. This makes Maxava HA a strong candidate to serve as the foundation of any HA/DR solution.
Recently, Maxava further enhanced its existing replication for the Integrated File System (IFS) on IBM i servers, adding multi-threaded replication to its most popular Enterprise+ product. This enables multiple IFS processes to replicate the source data in parallel, making data throughput dramatically faster and meeting anticipated market demand for even higher efficiency. Companies with large IFS volumes and a potential interest in Big Data will be best placed to take advantage of these improvements.
Higher replication speeds are vital whether your enterprise is simply wanting to maintain customer transaction records or to support full-blown Big Data projects.
Maxava executives recently sat for an interview to discuss the importance of HA/DR planning, the utility of Maxava HA, and how the product’s features contribute to a durable HA/DR plan.
HA and DR: Not Just for Catastrophes
One might expect news stories about the data disasters that have hit some major retailers in recent months to be a catalyst for other enterprises to seek ways of avoiding a similar fate. Surprisingly, that isn’t necessarily so.
“There’s a lot of awareness these days about disasters, mainly due to sophisticated media coverage, so more people are thinking about the risk proactively, rather than reacting to the latest disaster,” notes Simon O’Sullivan, Maxava’s senior vice-president. “While you might expect we’d get an influx of enquiries after a natural disaster, and we do get some, actually most businesses are aware of the risks and already thinking about business continuity.”
“The more common drivers for a business to seek an HA/DR solution are about ticking boxes for auditors or insurance, and the fact that more and more data is being kept,” he continues. “For example, ten or even five years ago the changes happening to a system might be 1 GB or less per day, which was relatively easy to re-create. Now it’s not unusual for us to come upon customers with hundreds of gigabytes of changes per day – which used to be the entire database in the old days. Organizations today are more proactive, and more aware of the changes and the volume of them – and don’t want to lose them.”
Having a strong replication option for preserving that data is as important as the data itself.
Staying on Top of Data with Sub-second Replication
Particularly in environments with high transaction volumes, anything that adds to the speed of replication helps avoid losing any data if a service interruption occurs. Maxava HA is designed to provide the highest speeds that technology allows.
“We aim for sub-second replication; we want to protect the customer to the very last transaction, so our replication has to be sub-second, with a transaction appearing on the target server in real time, the moment it is created,” O’Sullivan points out. “The sending of transactions is handled by remote journaling. We ensure that we keep up with the speed of the entries onto the target system.”
This keeps the opportunity for data loss to a minimum.
Also contributing to building HA and DR plans are such business parameters as the customer’s recovery time objective (RTO), which is basically the amount of time an enterprise can tolerate not having access to normal processing, and the recovery point objective (RPO), which is a measure of how much data loss the organization can withstand.
These measures must be determined independently for each enterprise but are an essential part of assuring overall data availability. An enterprise that operates 24/7 will have drastically different needs to organizations that function primarily during standard business hours, for example.
“Our customers tell us what their RTO and RPO are, then we look at their data volumes and we advise them on the environment that’s needed for them to achieve that (i.e., the quality of the pipe/network, the hardware and the processors),” O’Sullivan explains.
“If an organization’s replication is not performing, the RPO is impacted and may not meet business needs. That means that in the event of an outage, they can only recover back to a certain point, and there could be one or many transactions that were not replicated and are therefore lost.”
“It’s also important that the network is capable of supporting the volume of changes to be sent to the target system. We run a discovery process that looks at the number of changes and the volume of data, then we recommend bandwidth that will assist in delivering as close to real-time replication as is possible. We would generally recommend a dedicated LAN port for journal replication to avoid competing with other traffic,” O’Sullivan adds.
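The sizing logic O’Sullivan describes can be sketched roughly in a few lines. This is a back-of-the-envelope illustration with made-up figures and a hypothetical formula, not Maxava’s actual discovery process:

```python
# Illustrative link-sizing and RPO-exposure arithmetic. The peak factor,
# overhead multiplier, and example volumes are assumptions for this sketch.

def required_mbps(daily_change_gb: float, peak_factor: float = 3.0,
                  overhead: float = 1.2) -> float:
    """Bandwidth needed to keep up with journal changes.

    daily_change_gb: volume of journaled changes per day
    peak_factor:     how much busier the peak hour is than the daily average
    overhead:        protocol/framing overhead multiplier
    """
    avg_mbps = daily_change_gb * 8 * 1024 / 86_400   # GB/day -> Mbit/s
    return avg_mbps * peak_factor * overhead

def worst_case_rpo_loss(tx_per_sec: float, replication_lag_sec: float) -> int:
    """Transactions at risk if the source fails while replication lags."""
    return round(tx_per_sec * replication_lag_sec)

# Example: 200 GB of changes per day, peaks at 3x the average
print(f"{required_mbps(200):.0f} Mbit/s")   # ~68 Mbit/s
# 500 tx/s with a 0.5 s replication lag puts ~250 transactions at risk
print(worst_case_rpo_loss(500, 0.5))        # 250
```

The same arithmetic makes the earlier point concrete: the larger the daily change volume relative to the link, the further replication lags and the worse the achievable RPO.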
Recovery from Failover in as Little as Five Minutes
Two other important considerations for an enterprise are the recovery and failover times. In DR situations, recovery is the time it takes to restore normal business operations after an event. With HA, failover refers to how long it takes for transaction processing to be switched from the production system to the backup server, also called a role swap.
“People are always amazed when they learn that recovery and failover from an outage can really be as quick as five minutes,” O’Sullivan recalls. “However there is generally more to take care of than just the promotion of the backup server to production status. The time to a fully operational system depends on the customer applications running on the server(s), the complexity of the environment, and the size of the database.
“It’s also dependent on the preparation that the IT team is willing to put into getting failovers right. A huge part of a successful failover comes down to practice and testing. For example, we’ve observed with our most proactive customers that regular role swaps help organizations get faster every time. They really can get it down to five minutes or less. Swapping from primary to backup is easy, but making the backup available to users is generally more complex. The variables are remote communications, other servers that rely on the IBM i, network switches and routers, multiple service providers, and so forth.”
And what about while one machine is out of action?
“While there is only one machine running, transactions cannot be replicated outside of that machine,” observes O’Sullivan. “We always recommend that a customer have an appropriate backup strategy in place for this contingency, until a backup machine is available again. Once a second (replacement) machine has been deployed, then with Maxava installed, transactions will be replicated between the existing production server and the new machine that will become the new production server. At an appropriate point a role swap will promote the new server to production.”
There are actually two protocols for moving data between two servers, synchronous and asynchronous, and the difference can be significant because it directly affects replication speed – and therefore the RPO and RTO. Synchronous replication seems more reliable, but it is slower because the sending system must wait for an acknowledgement of each data packet from the receiving system before transmitting the next one. That assurance is useful where the communications link might abruptly be broken, but it costs time and becomes less effective over large geographical distances. In high-volume environments the delay can be critical, allowing transactions to back up – and potentially be lost – while the two systems verify their communications. With an asynchronous protocol, data is simply forwarded continuously, enabling the copied data on the backup server to lag the production server by as little as fractions of a second.
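The cost of waiting for acknowledgements is easy to quantify. The sketch below models a simple stop-and-wait synchronous scheme with illustrative numbers (a hypothetical 64 KB packet and a 40 ms round trip), not figures from Maxava:

```python
# Illustrative only: why per-packet acknowledgements throttle a
# synchronous link as distance (round-trip time) grows.

def sync_throughput_mbps(packet_kb: float, rtt_ms: float) -> float:
    """Stop-and-wait: one packet in flight, then wait for the ack."""
    return (packet_kb * 8 / 1024) / (rtt_ms / 1000)   # Mbit/s

# 64 KB packets over a 40 ms round trip (e.g. a distant DR site):
print(f"{sync_throughput_mbps(64, 40):.1f} Mbit/s")   # 12.5 Mbit/s
# Double the distance and the ceiling halves:
print(f"{sync_throughput_mbps(64, 80):.2f} Mbit/s")   # 6.25 Mbit/s
# An asynchronous stream over the same link is limited only by its
# bandwidth (say 100 Mbit/s), regardless of round-trip time.
```

The round trip, not the wire speed, becomes the bottleneck – which is why the article notes synchronous replication loses effectiveness over large geographical distances.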
“Maxava HA supports both synchronous and asynchronous protocols,” relates O’Sullivan. “It becomes an education exercise in that we explain the protocols so the customers fully understand the ramifications of each option, and we help them decide on the best protocol to use. The choice between synchronous and asynchronous protocols is driven by a number of factors, but the main consideration is the location of the two machines. If the target machine is in a remote location, asynchronous is almost always recommended to avoid response delays between source and target. The risks of using asynchronous protocol are mitigated by the use of remote journaling. The source and target operating systems constantly check the data integrity of the journals and journal receivers to ensure they are not compromised. In the event, 99 per cent of our customers use asynchronous because the benefits outweigh the negatives. For the Maxava software, it’s a very simple yes/no option inside remote journaling.”
Streamline Your IFS Replication
The announcement of an upgrade adding multi-threaded IFS replication to Maxava HA introduces a feature whose sophistication is not currently available in other HA solutions for the IBM i. Multi-threaded replication lets multiple IFS replication processes run in parallel, making replication more efficient through multiple apply groups and automated trigger points. Trigger points are thresholds, established in advance, that activate if too much unprocessed data accumulates on the backup system.
Maxava HA is designed to process larger volumes of IFS data in less time.
“Replication operates faster because the multi-threaded replication enables multiple concurrent processing of data items, which is more efficient than processing data items sequentially,” explains Peter Kania, Maxava’s director of technical services and development. “The more concurrent apply processes you have, the greater throughput you can achieve, depending on system resources such as memory, disk, disk arms, and data spread. Our customers can be caught up faster on their target systems, so even when they have a large backlog of data they can be ‘role swap ready’ and switch to the target machine far more quickly.
“On the other hand, sequential processing requires each task to be completed before the next can begin,” Kania continues. “The result is that not only are jobs processed one at a time, but the same types of job are re-handled over and over again instead of a collection of jobs being handled at once for better efficiency. Finally, the sequential processing of a single large job can hold up hundreds of different, smaller jobs. When concurrent processing is in use, the smaller jobs can all be run at the same time, and often finish before the large job is complete.
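Kania’s point about a single large job holding up smaller ones can be illustrated with a simple scheduling sketch. The job sizes are invented for illustration; this is not Maxava’s actual apply logic:

```python
import heapq

def makespan(job_minutes: list[float], apply_groups: int) -> float:
    """Greedy scheduling: each queued job goes to the earliest-free group."""
    workers = [0.0] * apply_groups        # time at which each group is free
    heapq.heapify(workers)
    for job in job_minutes:
        free_at = heapq.heappop(workers)  # earliest-free apply group
        heapq.heappush(workers, free_at + job)
    return max(workers)

# One large IFS object plus forty small ones (minutes, illustrative):
backlog = [60] + [1] * 40

print(makespan(backlog, 1))   # 100.0 - sequential: everything queues
print(makespan(backlog, 4))   # 60.0  - small jobs drain alongside the big one
```

With a single apply process the small jobs wait behind the large one; with four, total time falls to the length of the large job alone, which is the behavior Kania describes.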
“Customers can define the minimum/maximum number of processes/apply groups. They can modify the run characteristics using standard OS/400 performance adjustments. Also, the customer can temporarily change the maximum, on the fly, while it’s running – without stopping replication.”
Trigger points are also important for operational flexibility.
“Customers can choose to fix the number of parallel processes, but that doesn’t allow for any dynamic adjustment based on the backlog. The benefit of that, though, is that the customer sees a consistent number of jobs running on the system and we don’t consume additional CPU cycles that may be required for other work,” Kania elaborates.
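One way to picture a trigger point scaling apply processes between a configured minimum and maximum is sketched below. The threshold and counts are hypothetical, not Maxava defaults:

```python
# Hypothetical trigger-point logic: add one apply group for each
# backlog threshold crossed, bounded by min/max settings.

def apply_groups_for_backlog(backlog_mb: float, trigger_mb: float = 500,
                             min_groups: int = 1, max_groups: int = 8) -> int:
    """Scale apply groups up one step per trigger threshold crossed."""
    extra = int(backlog_mb // trigger_mb)
    return max(min_groups, min(min_groups + extra, max_groups))

print(apply_groups_for_backlog(100))    # 1 - backlog under the trigger
print(apply_groups_for_backlog(1200))   # 3 - two thresholds crossed
print(apply_groups_for_backlog(9000))   # 8 - capped at the maximum
```

Fixing the group count, as Kania notes, trades this dynamic response for a predictable number of jobs and a predictable CPU footprint.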
Rev Up Replication for Your Operations
Maxava includes some other features and add-ons that enhance replication options for customer sites.
“We have a function that lets you define certain data for replication, and also for omission,” O’Sullivan explains. “For example, you could omit temporary files which are not required to be replicated. Or you could omit a particularly large file that would slow replication. The function can be used on objects and/or libraries, so you can define just certain libraries, or just certain objects within certain libraries for replication. You can also redirect from a source library to a differently named target library on the backup system. This means you can replicate multiple different systems that have the same naming convention to a single source, instead of being limited to a one-to-one replication model.”
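The include/omit and redirection behavior O’Sullivan describes resembles an ordered rule table. The sketch below is a hypothetical illustration of that idea – the rule format, library names, and matching semantics are assumptions, not Maxava’s configuration syntax:

```python
from fnmatch import fnmatch

# Rules are evaluated in order; the first match wins.
# (action, library pattern, object pattern, optional target library)
RULES = [
    ("omit",      "QTEMP*",  "*",       None),         # skip temporary data
    ("omit",      "PRODLIB", "BIGFILE", None),         # skip one oversized file
    ("replicate", "PRODLIB", "*",       "PRODLIB_A"),  # redirect to renamed target
    ("replicate", "APP*",    "*",       None),         # replicate as-is
]

def route(library: str, obj: str):
    """Return the target library for an object, or None to omit it."""
    for action, lib_pat, obj_pat, target in RULES:
        if fnmatch(library, lib_pat) and fnmatch(obj, obj_pat):
            return (target or library) if action == "replicate" else None
    return None   # default: not selected for replication

print(route("PRODLIB", "CUSTOMERS"))  # PRODLIB_A
print(route("PRODLIB", "BIGFILE"))    # None (omitted)
print(route("APPLIB",  "ORDERS"))     # APPLIB
```

Renaming the target library per rule is what allows several same-named source libraries to land side by side on one backup system, rather than forcing a one-to-one replication model.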
“Maxava’s maxView Manager enables customers not only to monitor but to control their HA environments,” he continues. “They can start and stop configurations, start and stop remote journals, run audit processes, check WRKACTJOB or WRKSYSSTS information, and view historical status as well as Library, IFS and QDLS size information. You can even run required commands or CSF scripts, all from your PC, tablet or smartphone.”
Authenticating remote-device users isn’t a problem, either. “In order to access the maxView servers you need to be able to connect to your systems via your corporate network. Connection authentication is via your IBM i user profile. So security is just the same as logging into a standard session. If you are outside your corporate network then you will need to connect via a VPN initially on your device, so again security is controlled via the VPN tunnel and then your IBM i profile,” O’Sullivan describes.
This ensures that access to data remains limited to authorized users.
HA and DR in the Cloud: A Very Real Option for IBM i
Although many of O’Sullivan and Kania’s comments presume that affected customers are licensing Maxava’s products for use at their own sites, another option is to access Maxava HA via the cloud. This can be accomplished either via remote use of Maxava software by customers, or by letting Maxava or an MSP partner administer the software instead of the customer.
“In many cases, Maxava’s cloud offering is much better than a customer owning and managing their own DR server,” clarifies O’Sullivan. “Many customers don’t have the resources or experience to effectively monitor and manage a DR environment. Maxava or another service provider is always much more in control; we can see that everything is up and running and not backlogged. Testing is done regularly. We have greater visibility of processes and procedures if we’re managing them.”
“For example, there are a number of areas that need constant monitoring to avoid replication backlogs or critical objects not being backed up. Some of these items could be network issues like router or switch failures, issues caused by network reconfigurations, or even problems in the communication links between sites. Customer-initiated application or system changes can create issues if the replication settings are not updated to reflect modifications where appropriate. A cloud DR environment monitored and managed by Maxava or one of its allied managed service providers (MSPs) provides dedicated DR specialists who monitor and manage the replication to ensure potential issues don’t become problems, and to identify any event that is likely to affect operations or recovery. An additional benefit of Maxava’s cloud solution is that it can be more cost-effective than a customer maintaining their own production and DR environment.”
Bigger – Faster – Moving into the Future
Faster data replication, whether of the structured data found in databases or of unstructured data such as email correspondence from customers and comments extracted from social media, is likely to become a fundamental requirement in an enterprise’s future. Maxava’s addition of new IFS-related capabilities to its product reflects the company’s continuing effort to find new ways to adapt its offerings to customer objectives.
“The multi-threaded IFS is designed to help users process data faster on their backup system, which helps manage growing data volumes better,” concludes Kania. “The IFS is getting bigger and bigger as people are keeping more data. Maxava can help because of our ability to handle large data volumes and the fact that if everything’s set up correctly our product just doesn’t let backlogs accumulate.”
As large data sets grow to include input from such sources as humans, mobile devices, software logs, and sensory technologies, backlogs are more likely to accumulate. Like the weather, business needs can change their direction and nature in short periods. Faster data replication helps end users avoid such backlogs as they seek to access and analyze the most complete sets of information for whatever their business may need. The current direction of business points to a growing imperative to turn data into information, and streamlining that process will become increasingly important to all market segments.