N10-007 Compare and contrast risk-related concepts

Disaster recovery

Disaster recovery is undertaken when fault tolerance measures fail. Backups and backup strategies form the essential parts of disaster recovery.

Backup Methods: Several backup methods are available. The choice of method depends on the needs of the organization and on the time available for taking the backup. Backups are run within a scheduled backup window, because running them outside that window tends to slow down the server. The available backup methods are:

Full Backups: Also known as a normal backup, a full backup is the most complete type of backup. All the files on the hard disk are copied, so the system can be restored from a single set of files. It is not always the best option, because the volume of data can be enormous and the backup can take a long time. Administrators try to take full backups during off hours, or at a time when the load on the servers is lowest. Full backups serve well in smaller setups with a limited amount of data. Backup software determines which files have changed since the last backup with the help of a file attribute known as the archive bit; a full backup clears the archive bit on every file it copies.

Incremental Backups: An incremental backup is much faster than a full backup. It includes only the files that have changed since the last full or incremental backup, so it copies comparatively less data than a full backup. The limitation is that restoring files is much slower. A restore requires the last full backup tape and every incremental tape taken after it, up to the point to which the data has to be restored. The incremental tapes must also be stored, and applied, in chronological order.
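The restore requirement for incremental backups can be sketched in a few lines of Python. This is only an illustration: the Tape record, the restore_chain function, the tape labels and the dates are all hypothetical names and values chosen for the example, not part of any backup product.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Tape:
    label: str   # hypothetical tape label
    kind: str    # "full" or "incremental"
    taken: date  # date the backup was taken


def restore_chain(tapes, target):
    """Return the tapes to apply, oldest first, to restore data as of `target`."""
    usable = sorted((t for t in tapes if t.taken <= target), key=lambda t: t.taken)
    fulls = [t for t in usable if t.kind == "full"]
    if not fulls:
        raise ValueError("no full backup on or before the target date")
    last_full = fulls[-1]
    # Every incremental after the last full backup is needed, in date order.
    incrementals = [
        t for t in usable
        if t.kind == "incremental" and t.taken > last_full.taken
    ]
    return [last_full] + incrementals


# Tapes deliberately listed out of order to show the chronological sort.
tapes = [
    Tape("INC-TUE", "incremental", date(2024, 1, 9)),
    Tape("FULL-SUN", "full", date(2024, 1, 7)),
    Tape("INC-MON", "incremental", date(2024, 1, 8)),
    Tape("INC-WED", "incremental", date(2024, 1, 10)),
]
chain = restore_chain(tapes, date(2024, 1, 9))
print([t.label for t in chain])  # ['FULL-SUN', 'INC-MON', 'INC-TUE']
```

If any one tape in the chain is missing or unreadable, the changes it held cannot be restored, which is part of why incremental restores are slower and more fragile than full restores.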

Differential Backups: A differential backup records the files that have been changed or created since the last full backup. Its advantage over an incremental backup is that a restore needs only two tapes: the last full backup tape and the latest differential tape. In both backup time and restore time, differential backups fall between full and incremental backups.
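The difference between the three methods comes down to how each one treats the archive bit. The following Python sketch models that behavior with a plain dictionary. It is a simplified model, not real backup software: the run_backup function and the file names are hypothetical, and the archive bit is represented as a True/False value per file.

```python
def run_backup(files, method):
    """Simulate one backup run.

    `files` maps a file name to its archive bit (True means the file has
    changed since the bit was last cleared). Returns the sorted list of
    file names copied and updates the archive bits in place.
    """
    if method == "full":
        copied = sorted(files)
    elif method in ("incremental", "differential"):
        copied = sorted(name for name, bit in files.items() if bit)
    else:
        raise ValueError(f"unknown backup method: {method}")
    # Full and incremental backups clear the archive bit;
    # a differential backup leaves it set.
    if method in ("full", "incremental"):
        for name in copied:
            files[name] = False
    return copied


files = {"a.doc": True, "b.xls": True, "c.txt": True}

sunday_full = run_backup(files, "full")          # copies all three files
files["a.doc"] = True                            # a.doc is edited on Monday
monday_diff = run_backup(files, "differential")  # ['a.doc']
files["b.xls"] = True                            # b.xls is edited on Tuesday
tuesday_diff = run_backup(files, "differential") # ['a.doc', 'b.xls']
tuesday_incr = run_backup(files, "incremental")  # ['a.doc', 'b.xls'], bits cleared
wednesday_incr = run_backup(files, "incremental")  # [] because nothing changed since
```

Because the differential run leaves the bits set, each differential tape grows until the next full backup, while each incremental tape holds only the changes since the previous run.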

Business Continuity

All BC/DR plans need to encompass how employees will communicate, where they will go and how they will keep doing their jobs. The details can vary greatly, depending on the size and scope of a company and the way it does business. For some businesses, issues such as supply chain logistics are most crucial and are the focus of the plan. For others, information technology may play a more pivotal role, and the BC/DR plan may have more of a focus on systems recovery. For example, the plan at one global manufacturing company would restore critical mainframes with vital data at a backup site within four to six days of a disruptive event, obtain a mobile PBX unit with 3,000 telephones within two days, recover the company’s 1,000-plus LANs in order of business need, and set up a temporary call center for 100 agents at a nearby training facility.

But the critical point is that neither element can be ignored, and physical, IT and human resources plans cannot be developed in isolation from each other. (In this regard, BC/DR has much in common with security convergence.) At its heart, BC/DR is about constant communication.

Business, security and IT leaders should work together to determine what kind of plan is necessary and which systems and business units are most crucial to the company. Together, they should decide which people are responsible for declaring a disruptive event and mitigating its effects. Most importantly, the plan should establish a process for locating and communicating with employees after such an event. In a catastrophic event (Hurricane Katrina being a relatively recent example), the plan will also need to take into account that many of those employees will have more pressing concerns than getting back to work.

Battery backups/UPS

Power issues cannot be ignored when discussing fault tolerance measures. An uninterruptible power supply (UPS) is a device that keeps power flowing to the equipment it protects. It is a battery with built-in charging, and it performs several functions in server implementations. While mains power is good, the battery charges; when the power fails, the battery takes over. The objective is to ensure a safe shutdown of the server. The reasons that make a UPS necessary are:

  • Availability of Data: Access to the server, and to the data on it, is maintained during a power failure for as long as the UPS can supply power.
  • Loss of Data due to Power Fluctuations: Power fluctuations put data on the server at risk. Data held in cache could be lost if the voltage fluctuates.
  • Damage to the Hardware: Power fluctuations can damage the hardware components within a computer.
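The safe-shutdown objective can be illustrated with a small decision rule. The Python sketch below is an assumption-laden example rather than the logic of any real UPS agent: the ups_action function, the five-minute shutdown time and the safety factor of two are all hypothetical values chosen for illustration.

```python
def ups_action(on_battery, runtime_minutes, shutdown_minutes=5.0, safety_factor=2.0):
    """Decide what a server should do based on UPS status.

    on_battery       -- True when mains power has failed
    runtime_minutes  -- estimated battery runtime remaining
    shutdown_minutes -- assumed time the server needs to shut down cleanly
    safety_factor    -- assumed margin applied to the shutdown time
    """
    if not on_battery:
        return "run"        # mains power is good; the battery is charging
    if runtime_minutes <= shutdown_minutes * safety_factor:
        return "shutdown"   # start a clean shutdown while power remains
    return "wait"           # ride out the outage on battery for now


print(ups_action(on_battery=False, runtime_minutes=30))  # run
print(ups_action(on_battery=True, runtime_minutes=25))   # wait
print(ups_action(on_battery=True, runtime_minutes=8))    # shutdown
```

The point of the margin is that the shutdown must finish before the battery is exhausted, so the decision has to be made while there is still more runtime left than the shutdown itself needs.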

First responders

The time between discovery of an incident and the handover of digital evidence is critical to successful evidence retrieval. Mishandled evidence, whether it is to be used in court or solely in house, can damage the integrity of the investigation. For instance, viewing pornographic images that were downloaded to an employee’s computer will change the files’ time/date stamps. If this occurs, there is no way to prove that it was the employee who downloaded the images and not the network administrator.
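One common way to show that evidence has not changed between two points in time is to record a cryptographic hash of it. The Python sketch below illustrates the idea using a temporary file. The fingerprint function and the file name are hypothetical, and in practice a check like this would be run against a forensic copy rather than the original media, because even reading a file can update its access time stamp.

```python
import hashlib
import os
import tempfile


def fingerprint(path, chunk_size=65536):
    """Return the SHA-256 digest, size and modification time of a file."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read in chunks so large evidence files do not have to fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    info = os.stat(path)
    return {
        "sha256": digest.hexdigest(),
        "size": info.st_size,
        "modified": info.st_mtime,
    }


with tempfile.TemporaryDirectory() as workdir:
    copy_path = os.path.join(workdir, "image.bin")  # stand-in for a working copy
    with open(copy_path, "wb") as handle:
        handle.write(b"example evidence bytes")

    before = fingerprint(copy_path)
    after = fingerprint(copy_path)
    unchanged = before["sha256"] == after["sha256"]

    # Any alteration, even a single byte, produces a different digest.
    with open(copy_path, "ab") as handle:
        handle.write(b"!")
    tampered = fingerprint(copy_path)["sha256"] != before["sha256"]

print(unchanged, tampered)  # True True
```

Recording the digest at the moment of collection gives the examiner a reference value to compare against later.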

The most critical concern, then, is to create the most conducive environment possible for the forensic examiner. The following points discuss vital considerations for the administrator who is acting in a first responder’s role and must maintain the integrity of evidence and accountability.

Avoid FUD

Fear, uncertainty and doubt will surely be some of your first reactions, especially in the instance of a network break-in. It is important to remember that you are not the first one that this has happened to and not to act rashly. For instance, if you notice that a system has been hacked into, it may be your first reaction to panic and pull the network cable. Although this can stop the attack, it may trigger a retaliatory routine planted by the hacker and cause further damage. By taking time to investigate and consult with a specialist, you may save the system from irreparable damage.

Data breach

A potential security breach typically shows up in the security log as audit failures for logon or logoff attempts. To save space and prevent the log files from growing too big, administrators might choose to audit only failed logon attempts and not successful ones.

Each event in a security log contains additional information to make it easy to get the details on the event:

  • Date: The exact date the security event occurred.
  • Time: The time the event occurred.
  • User: The name of the user account that was tracked during the event.
  • Computer: The name of the computer used when the event occurred.
  • Event ID: The Event ID tells you what event has occurred. You can use this ID to obtain additional information about the particular event. For example, you can take the ID number, enter it at the Microsoft support website, and gather information about the event. Without the ID, it would be difficult to find this information.
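The fields listed above are enough to spot a pattern of failed logons. The Python sketch below assumes the security log has been exported as CSV with those five columns. The sample rows, the failed_logons function and the threshold of three are hypothetical; the Event ID 4625 is the ID that current Windows versions record for a failed logon, and 4624 is the ID for a successful one.

```python
import csv
import io
from collections import Counter

FAILED_LOGON_ID = "4625"  # Windows: "An account failed to log on"

# Hypothetical export of a security log with the fields described above.
SAMPLE = """Date,Time,User,Computer,Event ID
2024-01-08,09:00:01,alice,WS-01,4624
2024-01-08,09:02:10,bob,WS-02,4625
2024-01-08,09:02:15,bob,WS-02,4625
2024-01-08,09:02:21,bob,WS-02,4625
2024-01-08,09:05:44,alice,WS-01,4625
"""


def failed_logons(csv_text, threshold=3):
    """Return accounts with at least `threshold` failed logon events."""
    rows = csv.DictReader(io.StringIO(csv_text))
    counts = Counter(
        row["User"] for row in rows if row["Event ID"] == FAILED_LOGON_ID
    )
    return {user: n for user, n in counts.items() if n >= threshold}


suspects = failed_logons(SAMPLE)
print(suspects)  # {'bob': 3}
```

A single failed logon is usually a mistyped password; repeated failures for one account in a short period are the kind of audit failure that deserves a closer look.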

Single point of failure

A single point of failure (SPOF) is a potential risk posed by a flaw in the design, implementation or configuration of a circuit or system in which one fault or malfunction causes an entire system to stop operating.

In a data center or other information technology (IT) environment, a single point of failure can compromise the availability of workloads – or the entire data center – depending on the location and interdependencies involved in the failure.

Consider a data center where a single server runs a single application. The underlying server hardware would present a single point of failure for the application’s availability. If the server failed, the application would become unstable or crash entirely, preventing users from accessing the application and possibly even resulting in some measure of data loss. In this situation, the use of server clustering technology would allow a duplicate copy of the application to run on a second physical server. If the first server failed, the second would take over to preserve access to the application and avoid the SPOF.

Consider another example where an array of servers is networked through a single network switch. The switch would present a single point of failure. If the switch failed (or simply disconnected from its power source), all of the servers connected to that switch would become inaccessible from the remainder of the network. For a large switch, this could render dozens of servers and their workloads inaccessible. Redundant switches and network connections can provide alternative network paths for interconnected servers if the original switch should fail, avoiding the SPOF.
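The switch example can be checked mechanically by removing one device at a time from a model of the network and testing whether every server is still reachable. The Python sketch below does this with a breadth-first search. The topology, the device names such as core, sw1 and srv1, and the function names are hypothetical and chosen only to mirror the example.

```python
from collections import deque


def undirected(pairs):
    """Build an adjacency map from a list of (device, device) links."""
    links = {}
    for a, b in pairs:
        links.setdefault(a, set()).add(b)
        links.setdefault(b, set()).add(a)
    return links


def reachable(links, start, removed):
    """Return every device reachable from `start` when `removed` has failed."""
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbor in links.get(node, ()):
            if neighbor != removed and neighbor not in seen:
                seen.add(neighbor)
                queue.append(neighbor)
    return seen


def single_points_of_failure(links, source, targets):
    """List devices whose failure cuts any target off from the source."""
    spofs = []
    for device in sorted(links):
        if device == source or device in targets:
            continue
        alive = reachable(links, source, removed=device)
        if any(target not in alive for target in targets):
            spofs.append(device)
    return spofs


servers = {"srv1", "srv2"}
single = undirected([("core", "sw1"), ("sw1", "srv1"), ("sw1", "srv2")])
redundant = undirected([
    ("core", "sw1"), ("core", "sw2"),
    ("sw1", "srv1"), ("sw2", "srv1"),
    ("sw1", "srv2"), ("sw2", "srv2"),
])

print(single_points_of_failure(single, "core", servers))     # ['sw1']
print(single_points_of_failure(redundant, "core", servers))  # []
```

With one switch, losing sw1 isolates both servers. With a second switch and a second link from each server, no single switch failure isolates either of them.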

It is the responsibility of the data center architect to identify and correct single points of failure that appear in the infrastructure’s design. However, it’s important to remember that the resiliency needed to overcome single points of failure carries a cost (e.g. the price of additional servers within a cluster or additional switches, network interfaces and cabling). Architects must weigh the need for each workload against the additional costs incurred to avoid each SPOF. In some cases, designers may determine that the cost of correcting a SPOF outweighs the benefits of the workloads at risk.
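One simple way to frame that trade-off is to compare the cost of the redundancy with the loss expected if the SPOF is left in place. The Python sketch below is a deliberately crude model under stated assumptions: the worth_fixing function is hypothetical, the probability and money figures are invented for illustration, and a real assessment would consider more than a single yearly probability.

```python
def worth_fixing(redundancy_cost, outage_probability, outage_loss):
    """Return True when the expected yearly loss exceeds the cost of redundancy.

    redundancy_cost    -- yearly cost of the extra servers, switches or links
    outage_probability -- assumed chance (0 to 1) of the failure in a year
    outage_loss        -- assumed loss to the business if the failure occurs
    """
    expected_loss = outage_probability * outage_loss
    return expected_loss > redundancy_cost


# A critical workload: a 10% chance of losing 500,000 justifies spending 20,000.
print(worth_fixing(20_000, 0.10, 500_000))  # True

# A minor workload: a 1% chance of the same loss does not.
print(worth_fixing(20_000, 0.01, 500_000))  # False
```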

Adherence to standards and policies

PCI Security Standards are technical and operational requirements set by the PCI Security Standards Council (PCI SSC) to protect cardholder data. The Council is responsible for managing the security standards, while compliance with the PCI Security Standards is enforced by the payment card brands. The standards apply to all organizations that store, process or transmit cardholder data – with guidance for software developers and manufacturers of applications and devices used in those transactions.

If you are a merchant that accepts payment cards, you are required to be compliant with the PCI Data Security Standard. You can find out your exact compliance requirements only from your payment brand or acquirer. However, before you take action, you may want to obtain background information and a general understanding of what you will need to do from the information and links here.

The PCI DSS follows common-sense steps that mirror security best practices. There are three steps for adhering to the PCI DSS – which is not a single event, but a continuous, ongoing process:

  • Assess: Identify cardholder data, take an inventory of your IT assets and business processes for payment card processing, and analyze them for vulnerabilities that could expose cardholder data.
  • Remediate: Fix vulnerabilities and do not store cardholder data unless you need it.
  • Report: Compile and submit required remediation validation records (if applicable), and submit compliance reports to the acquiring bank and card brands you do business with.
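As a small illustration of the Assess step, the Python sketch below searches text for digit sequences that look like payment card numbers and confirms them with the Luhn checksum that card numbers use. It is not a PCI-approved discovery tool: the function names, the regular expression and the sample text are hypothetical, and the number shown is a widely used test card number rather than a real account.

```python
import re


def luhn_valid(number):
    """Return True if a string of digits passes the Luhn checksum."""
    total = 0
    for index, char in enumerate(reversed(number)):
        digit = int(char)
        if index % 2 == 1:      # double every second digit from the right
            digit *= 2
            if digit > 9:
                digit -= 9
        total += digit
    return total % 10 == 0


# 13 to 19 digits, optionally separated by single spaces or hyphens.
CANDIDATE = re.compile(r"\b(?:\d[ -]?){12,18}\d\b")


def find_card_numbers(text):
    """Return masked versions of likely card numbers found in `text`."""
    hits = []
    for match in CANDIDATE.finditer(text):
        digits = re.sub(r"[ -]", "", match.group())
        if luhn_valid(digits):
            # Mask all but the last four digits so the report holds no full number.
            hits.append("*" * (len(digits) - 4) + digits[-4:])
    return hits


sample = "order 1001 paid with 4111 1111 1111 1111 on file; ref 1234567890123456"
found = find_card_numbers(sample)
print(found)  # ['************1111']
```

The reference number in the sample has the right length but fails the checksum, so it is not reported. Masking the output keeps the scan itself from becoming another place where cardholder data is stored.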

Vulnerability scanning

In a vulnerability test, you run a software program that contains a database of known vulnerabilities against your system to identify weaknesses. It is highly recommended that you obtain such a vulnerability scanner and run it on your network to check for any known security holes. It is always preferable for you to find them on your own network before someone outside the organization does by running such a tool against you.
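The core of such a tool is a lookup of what is running on each system against the database of known vulnerabilities. The Python sketch below shows that matching step only. Everything in it is made up for illustration: the service names, version numbers, EXAMPLE identifiers, host names and the scan function do not refer to real products or real vulnerabilities, and a real scanner discovers the inventory by probing the network rather than reading it from a dictionary.

```python
# Hypothetical vulnerability database keyed by (service, version).
KNOWN_VULNERABILITIES = {
    ("exampled", "1.0"): "EXAMPLE-0001: remote code execution",
    ("exampled", "1.1"): "EXAMPLE-0002: information disclosure",
    ("demo-ftp", "2.3"): "EXAMPLE-0003: anonymous write access",
}

# Hypothetical inventory: host name -> list of (service, version) found on it.
INVENTORY = {
    "web01": [("exampled", "1.1"), ("demo-ssh", "8.0")],
    "files01": [("demo-ftp", "2.3")],
    "db01": [("exampled", "2.0")],
}


def scan(inventory, database=KNOWN_VULNERABILITIES):
    """Return (host, service, version, issue) for every known weakness found."""
    findings = []
    for host, services in sorted(inventory.items()):
        for service, version in services:
            issue = database.get((service, version))
            if issue is not None:
                findings.append((host, service, version, issue))
    return findings


findings = scan(INVENTORY)
for host, service, version, issue in findings:
    print(f"{host}: {service} {version} -> {issue}")
```

Because the scan only reports what is in its database, the results are only as current as the database itself, which is why scanners need regular updates.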

The vulnerability scanner may be a port scanner (such as Nmap: http://nmap.org/), a network enumerator, a web application scanner, or even a worm, but in all cases it runs tests on its target against a gamut of known vulnerabilities.

Although Nessus (http://www.nessus.org/nessus/) and Retina (http://www.eeye.com/Retina) are two of the better-known vulnerability scanners, SAINT and OpenVAS (which was originally based on Nessus) are also widely used.

Penetration testing

A penetration test is a proactive and authorized attempt to evaluate the security of an IT infrastructure by safely attempting to exploit system vulnerabilities, including OS, service and application flaws, improper configurations, and even risky end-user behavior. Such assessments are also useful in validating the efficacy of defensive mechanisms, as well as end-users’ adherence to security policies.

Penetration tests are typically performed using manual or automated technologies to systematically compromise servers, endpoints, web applications, wireless networks, network devices, mobile devices and other potential points of exposure. Once vulnerabilities have been successfully exploited on a particular system, testers may attempt to use the compromised system to launch subsequent exploits at other internal resources, specifically by trying to incrementally achieve higher levels of security clearance and deeper access to electronic assets and information via privilege escalation.

Information about any security vulnerabilities successfully exploited through penetration testing is typically aggregated and presented to IT and network systems managers to help those professionals make strategic conclusions and prioritize related remediation efforts. The fundamental purpose of penetration testing is to measure the feasibility of systems or end-user compromise and evaluate any related consequences such incidents may have on the involved resources or operations.