Managing security is managing risk. As explained in Chapter 1,
Security ensures the confidentiality, integrity, and availability of information assets through the reasonable and appropriate application of administrative, technical, and physical controls, as required by risk management.
In Chapter 1, we explored risk at a high level. As security practitioners, however, we need a working definition supported by the tools necessary to measure, report, and mitigate unwanted risks to physical and electronic information assets. In this chapter, we step through a risk assessment of a medium-sized business’s customer invoicing database. Our goal is to determine the risk of a threat agent gaining direct access to sensitive customer data.
Information Security Risk Management (ISRM)
For security practitioners, ISRM is
…the proper application of business risk mitigation tools and methods resulting in the implementation of security controls that, when operating properly—either alone or as part of a layered set of safeguards—mitigate business risk associated with an information system to a level acceptable to management. This must be done in a way that maintains the highest possible operational effectiveness of the personnel and processes using the systems protected by these controls (Olzak, 2008, p. 3).
Simply, it is our job to reduce the probability that a threat agent will exploit a vulnerability and cause significant harm to the business or its customers, employees, investors, or the public in general. Figure 2-1 is a different approach to the risk formula introduced in Chapter 1.
In our new formula, I replace probability of occurrence with means, opportunity, and motive. Reactively, investigators use these to identify suspects. Proactively, we can use them to understand how a criminal might look at our information assets.
Figure 2-1: Modified Risk Formula
Means, Motive, and Opportunity
Probability of occurrence traditionally translates to (threats * vulnerabilities). In Figure 2-1, threats break down to means and motive. Opportunity is another way of describing the physical and logical doors and windows left open. In other words, a threat possesses skills or capabilities (means) needed to satisfy financial, political, personal, or other objectives (motive). The threat uses a threat agent or action to launch an attack or cause unwanted network or system effects.
Motive is often the most important variable. For example, a person with a strong motive might relentlessly pursue his target. Attackers with weak motivation might simply give up after hitting the first difficult prevention control. Understand the possible targets within your organization and how criminals, terrorists, vigilante groups, etc. might perceive them.
The ISRM Process
ISRM consists of three phases: assess, mitigate, and manage. Think of them as milestones on a continuous journey, as shown in Figure 2-2. The journey begins with a risk assessment. The assessment helps us understand risk as calculated in the formula in Figure 2-1. Understanding risk and the elements that contribute to it, we make recommendations to management to mitigate risk where necessary. Finally, we manage our controls, monitoring and measuring to ensure expected risk mitigation results. Periodically, we assess again to identify new threats, vulnerabilities, or other changes that increased our risk beyond the acceptable threshold.
Figure 2-2: ISRM Process
The following is a high-level look at the 3 phases and 10 steps needed to complete them.
Step 1: System Definition
Step 2: Threat Identification
Step 3: Vulnerability Identification
Step 4: Attack Path Controls Assessment
Step 5: Impact Analysis
Step 6: Risk Determination
Step 7: Controls Recommendations
Step 8: Action Plan and Proposal Creation and Presentation
Step 9: Implement Controls
Step 10: Measure and Adjust
In the Assess phase, we create a detailed description of current state. The assessment target might include
- A network segment
- A system, defined as a collection of devices used to deliver a business process, such as payroll
- An individual device
- All data center segments
- A remote network
- A cloud service provider
The type of target you select is not important. However, the assessment target should be clearly delineated, and it should be something an attacker might plausibly pursue. In our customer data server example, the assessment begins with a close look at the database server. The process appears sequential, moving forward from one step to the next. However, assessors often find themselves revisiting previous steps as they discover new information.
Step 1: System Definition
System decomposition breaks down our database server into the various components of its potential attack surface (Olzak, 2011(b)):
- Entry points. Methods used by the server software to receive information.
- Exit points. Points where data moves to other systems.
- Direct exit points provide data to other systems without any intermediate stages.
- Indirect exit points provide data to the direct exit points.
- Data channels. Protocol-enabled pathways over which data travels between the server and other network-attached devices. Examples include TCP and UDP sockets, remote procedure call (RPC) endpoints, and named pipes.
- Untrusted data items. Persistent entities attackers use to control systems or extract data. Examples include files, cookies, database records, and registry entries. Attackers cause direct entry points to read from untrusted data items or use entry points to write into untrusted data items (Manadhata, Karabulut, & Wing, n.d.). In this sense, untrusted data items act as threat agents used to control a device or system.
System information is usually available from four sources (Olzak, 2011):
- Existing documentation. System, server, desktop, and network device implementations are typically documented as part of consistently followed change and architecture processes. If these documents do not exist when starting an assessment, this is your first vulnerability. For our target server, documents minimally should include
- Network diagrams
- Change management requests and related documentation
- System design documents
- System build documents with expected security configurations, including
- Operating system
- Database management system
- Other installed software
- Interviews. Even the most conscientious IT department misses a document or document change. Further, it is important to confirm that change and architectural management processes are actually followed. Interviewing key staff members is a good way to understand how well they manage the server. See Appendix A for a list of sample questions.
- Questionnaires. Employees are not always available, and sometimes managers prefer to simply respond to an email. In these cases, questionnaires asking interview questions might help. However, questionnaires deny face-to-face interaction. In an interview, a person’s reaction to a question can sometimes indicate an answer falling somewhat distant from the truth, and additional questions might help clear up the issue.
- Network scans. Once you gather all target system information, confirm it with scans. Confirm that the server actually exists within the network and security contexts described by documentation and staff. For example, does the server actually sit in a restricted network segment, or is the data center set up as a flat network? Scans can be automated or visual.
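A scan does not have to be elaborate to be useful. The following is a minimal sketch, assuming you have written authorization to probe the network, of a check that tests whether a TCP port on the database server is reachable from wherever the script runs. The function name `tcp_port_open` is our own, chosen for illustration.

```python
import socket


def tcp_port_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection raises OSError (refused, unreachable, timed out)
        # when the connection cannot be established.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

Run from a business-user workstation, a call such as `tcp_port_open("192.0.2.10", 1433)` (a placeholder address and the default Microsoft SQL Server port) returning True would suggest that the server is reachable from that segment, which contradicts any documentation placing it in a restricted segment. A single reachability test is only a hint; it does not replace a proper scanning tool.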
After a close review of all information gathered, we create our own logical representation of the database server’s role in the accounts receivable system, as shown in Figure 2-3. Do not make your diagrams so complicated that managers, business users, and IT staff cannot confirm that you clearly understand what you think you understand…
Using our logic diagram, we can quickly see the basic layout of the system supported by the Microsoft SQL database instance. As configured, only two data channels should exist: one from the application server and another from the database administrators’ (DBAs’) end-user devices. The system is configured according to integrity-enforcement best practice, with no direct business-user access to the database.
Figure 2-3: Logic Diagram
Next, we use our diagram once again to meet with our assessment contacts to determine or verify how they manage
- Authentication. How do employees, services, devices, etc. (subjects) verify their identities before accessing the system overall or the database server only (objects)? A subject is an entity attempting to gain access to a resource, an object. A subject can also be an object, depending on its role in a process.
- Authorization. Once a subject verifies its identity, how do security controls enforce
- Separation of duties – ensuring no one subject can perform all tasks associated with a business process.
- Least privilege – ensuring subjects can only modify, add, or delete data as required by their roles in a business process.
- Need-to-know – ensuring subjects “see” only the data elements required by their roles in a business process.
- Accountability. Do controls track what a subject did and when?
- Does each subject have a unique account ID and password?
- Are transactions logged? Are the logs protected?
- Is non-repudiation enforced? (Non-repudiation ensures a subject, typically a human, cannot deny accessing or modifying data or other objects.)
- Attack surface. How are entry points, exit points, and data channels protected? How do controls prevent, detect, or react to untrusted data items?
- How do subjects input or access data via the Web server? Do Web server applications validate input?
- How do subjects input or access data via the application server? Do server applications validate input?
- Who has access to change operating system or application configurations? How are changes made, tracked, etc.?
- How do DBAs perform maintenance tasks? How well are their workstations secured?
- What are the direct exit points and how are they secured?
- What are the indirect exit points and how are they managed?
- How are system interfaces configured?
Existing controls identification
Our database server does not exist in isolation. It is part of a system, which might be part of a network segment, which might be part of an enterprise network. Each of these infrastructures contributes to the server’s overall security context, as shown in Figure 2-4. The overall risk of database server breach is the aggregate risk of its security context. Controls in place or missing at each of the layers represented—there might be more or fewer layers in a specific organization—determine overall risk.
Figure 2-4: Security Context
Follow the paths…
One way to assess a system or individual device risk is by using a simple network diagram. A network diagram used for a risk assessment does not necessarily require the level of detail needed during architecture design activities. What is required, however, is a clear picture of what a threat agent/action will face when attempting to reach its target. Figure 2-5 is the network diagram we use for our sample organization, Erudio Products.
Figure 2-5: Current State Network Diagram
As it should, the network diagram tracks closely with the system logic diagram. The difference is in how we perceive an attack path. The logic diagram depicts application modules and their relationships. The network diagram is better at showing the logical—or physical—path to a potential target.
In Figure 2-5, we see two exposed paths to the database server: remote access and direct connection within the corporate offices. These paths are highlighted in Figure 2-6. Each approach should touch one or more security layers. Identifying the layers helps identify current risk.
Figure 2-6: Potential Attack Paths (in red)
Before starting controls analysis, it is helpful to base your assessment on a standard of best practice and any relevant regulatory requirements. This helps identify all possible challenges and demonstrates to management and auditors why you recommended mitigating controls. Chapter 1 provides a look at the most common regulations. Two common standards of best practice are COBIT (http://mcaf.ee/spct7) and ISO/IEC 27002:2005 (http://mcaf.ee/mkl2c).
Using documentation, interviews, our selected standard of best practice, Sarbanes-Oxley guidance, and our own scanning of the network, we identified the following control challenges.
- Physical Security
- The data center contains the switch and servers along both attack paths. A cypher lock secures entry. However, Erudio has not changed the combination in two years.
- No cameras or other physical detection/monitoring devices exist.
- On-site security guards staff the main floor after business hours and make regular rounds.
- Erudio does not require employees to openly display company ID badges. ID badges are not checked upon facility entry during business hours.
- Most desktop computers sit in cubicles in common areas.
- Open Ethernet ports, leading directly to the enterprise switch, are available for use in the cubicles and in the conference rooms.
- Logical Security
- Anti-malware software is installed and up-to-date on all appropriate devices.
- Password-based authentication is in place.
- Erudio enforces strong passwords
- Most passwords change every 90 days
- The domain administrator password has not changed in two years
- The database administrator password has not changed in three years
- Service account passwords never change
- Privileged account passwords on LAN/WAN devices never change
- SQL authentication is used between the application server and the database server
- The patch management policy is not consistently followed
- Microsoft SQL Server is behind several patch levels. The operating system on the database server is also behind several patch levels.
- The application server was recently upgraded. Application and operating system patches are up to date.
- The Web server is behind several patch levels.
- The firewall patches are up to date.
- The enterprise switch patches are up to date.
- No log management processes exist.
- No event monitoring and reporting solution exists.
- Erudio uses a flat network model. In other words, the network is one large network segment.
- HTTPS security is used for remote connection to the Web server
- Administrative Security
- A security program exists. However,
- Erudio does not monitor for compliance
- Sanctions do not exist for non-compliance
- The system annually undergoes an external audit for Sarbanes-Oxley compliance.
- No third-party assessments are scheduled or being considered.
- A change management process exists to ensure compliance for new solutions and upgrades.
Based on comments from my students over the years, this list likely fits a large number of organizations. In addition, Erudio is required to comply with the PCI DSS. However, it is difficult to extract probability of occurrence from a bulleted list and a network diagram. Thankfully, another tool makes risk analysis a simpler task. In Step 4, we use an attack tree to organize and extract meaning from these and additional findings.
Affected business processes
Again, few systems operate in isolation. Most business systems receive data from upstream processes and send processed information to downstream processes, as shown in Figure 2-7.
Figure 2-7: Process Dependencies
Systems in the data center support business processes. When a system fails, one or more processes are affected. In the simple example above, if invoicing fails, the path from order processing to shipping might also fail. That is the situation at Erudio Products if our target invoicing system is unavailable. Failure to ship means unhappy customers, an important consideration when assessing business impact.
Finally, be sure to identify each process’ maximum tolerable downtime (MTD). In other words, how long can a process or process set be unavailable before the business suffers serious irreparable harm?
Business impact also depends on the sensitivity of data affected by a breach or other security incident. Data sensitivity is measured by using a classification scheme, as shown in Table 2-1. This classification scheme might not work for your business; that is not important. What is important is having some way to identify various levels of information sensitivity for risk identification and control design.
Table 2-1: Data Classification Scheme
Data’s classification depends on the impact on the business, customers, investors, employees, and the public of unauthorized disclosure, unavailability, or modification of a specific data set. Examples of data sets usually falling into the restricted classification in Table 2-1 include
- PII or credit information. Personally identifiable information, or PII, is any combination of personal attributes that uniquely identify a single individual. Examples include name, address, date of birth, social security number, etc. Attackers targeting PII are usually highly motivated by the financial gain realized by selling the information. PII allows a criminal to masquerade as a data-theft victim to derive some benefit. In addition, data elements used to make credit purchases or manage financial accounts fall into this classification set, including credit card or banking account numbers.
- ePHI. Electronic protected health information (ePHI) is regulated by HIPAA (see Chapter 1). It is similar to PII but goes further and includes information about a person’s health, care, or medical insurance.
- Intellectual property. Intellectual property provides competitive advantage and might be the organization’s reason for existence. It is either created by the organization or purchased from another party. In either case, loss of intellectual property might result in significant loss of revenue or business failure.
- Organization financial information. Regulations such as the Sarbanes-Oxley Act or investor audit requirements might require protection of internal financial information.
- Network access and configuration information. One of the phases of a successful attack is retrieving information about the target network. Not protecting the following information makes remote or local network footprinting easy.
- IP addresses
- Server names
- Switch or router configurations
- Account names and passwords
- Make, model, configuration, and operating system levels of firewalls, routers, switches, and intrusion protection/detection devices
- Operating system versions and patch levels
- Organization structure. Making its internal organization chart and phone listings publicly available might not seem like a big risk to most organizations, but it is a very big win for social engineers. Knowing who does what and where, and their phone numbers, is a win for phishing, spear phishing, or other social engineering attacks.
Always remember that assessments are data-centric. Understanding the impact of failing to protect the confidentiality, integrity, and availability of target data sets directly affects both attacker motive and business impact components of the risk formula. Consequently, the types of controls you do or do not implement, and the time and dollars spent, depend on accurate data classification.
Security teams should not classify data. Rather, department or C-level managers, acting as data owners, should classify all data stored, processed, and passing through an organization. As security practitioners, it is our responsibility to work with Legal, Internal Audit, external auditors, and data owners to help arrive at the right classification levels. Following classification, we have additional responsibilities to assess risk and implement/manage security controls.
We classify Erudio’s invoicing data as restricted. It contains customer PII and is regulated by Sarbanes-Oxley.
Step 2: Threat Identification
As defined in Chapter 1, threats are intentional or unintentional methods or events that might leverage one or more vulnerabilities. This increases the business impact factor in our risk formula (Figure 2-1). Table 2-2 lists common threat sources, motives, and agents/actions. It is not a complete list, but it provides guidance on how to create your own.
For the Erudio assessment, we use the cyber-criminal threat source with a financial motive. The actions assessed are remote and external system intrusion. For a real-world assessment, consider all relevant threat sources, motives, and agents/actions. You can find additional information at US-CERT (http://mcaf.ee/2vs6c) and FEMA (http://mcaf.ee/pwjcl).
Table 2-2: Threat Sources (Stoneburner, Goguen, & Feringa, 2002)
Step 3: Vulnerability Identification
We discover network or system vulnerabilities in three ways: scanning, vendor notifications, and research into known vulnerabilities relevant to the probable threats identified in Step 2. Scanning and vendor notifications are valuable when looking for known vulnerabilities and failure to follow best practices. However, they do not provide a complete picture when determining how likely it is that a specific threat agent/action will be successful; research helps fill some of the gaps. Using all three approaches, identify threat/vulnerability pairs, as shown in Table 2-3.
Note that for each threat source, there can be multiple threat agents/actions. We are using only one in our example assessment. However, the table lists relevant vulnerabilities discovered in this and previous steps. The vulnerabilities in red were identified and confirmed with a combination of scanning, vendor website searches, and research.
The table approach is a good way to demonstrate findings. However, it lacks the depth needed to apply existing controls to determine the actual risk of Table 2-3 vulnerabilities. This is an important point; for example, just because someone publishes a “critical” vulnerability does not mean it is critical in your environment. A review of threat/vulnerability pairs within the context of potential attack paths and existing controls determines your organization’s risk.
Table 2-3: Threat/Vulnerability Pairing
Step 4: Attack Path Controls Assessment
A good tool for a current control analysis is the attack tree. An attack tree is a logical representation of the path, devices, and controls a threat agent/action must traverse on its way to the target. The attack tree for the threat/vulnerability pair in Table 2-3 is depicted in Figure 2-8.
Before we continue, remember that this is a general approach used to determine the potential impact of existing or missing controls. We are not academics; we are security practitioners. The following calculations are not meant to be precise measurements of probability. Rather they provide a “best guess,” or SWAG, about how easy it is for an attacker to reach a target.
The probability that an agent/action can reach the target is calculated using skilled guesses, information gathered about the assessed threats, and a little probability theory… just a little. Refer to Figure 2-8. The boxes are organized in layers. You might do it differently, but we will use this approach for our example. Within each box you see P=n. “P” represents probability, and n is the estimated probability, expressed as a fraction of 1, that an attacker will crack through the control/device represented by the box. A value of 1.0 means 100% probability.
The value of n is largely subjective. There are too many variables to calculate an accurate number. Therefore, we use our experience and information gathered in previous steps to make a scientific guess. Often, consulting with outside experts is necessary to increase the accuracy of one or more probability estimates.
Figure 2-8: Network Attack Tree
For each layer in Figure 2-8, advancement to the next layer requires either a set of conditions (AND) or a single condition (OR). For example, a remote hacker using the Web server as a launching platform for his attack must pass through the external firewall AND compromise the Web server AND pass through the internal firewall. Based on current vulnerabilities, threat source motives, and existing controls, we estimate a probability of 1.0 for each firewall breach and a probability of 0.5 for Web server compromise. When all conditions must exist to move to the next layer, we calculate probability of success as
P(success) = P(1) * P(0.5) * P(1)
The probability that an external attacker will reach the switch is 0.5, or 50%. If we remove the Web server and assume a direct database connection attempt via TCP Port 1433, the probability of passing through this layer increases to P=1.0.
The internal attacker must gain physical access to the building and to an open, active port. We estimate the probability of success at P=0.5: although no physical controls prevent access to a port during the day, we assume that one or two employees are alert enough to ask about a stranger sitting alone in a conference room. If a rogue employee attempts access, the probability of reaching the switch is P=1.0. Likewise, P=1.0 if an employee’s system is owned by the attacker and controlled from a remote location.
Passing through the switch is P=1.0. No network segments with associated access control lists exist. At this point, we calculate each attack path separately. The remote attack using the Web server carries a P=0.5. Since the attacker must pass through the perimeter and the switch, the calculation is P(success) = P(0.5) * P(1.0). In other words, the probability of getting from an external location to a location beyond the switch is P=0.5. The same probability exists for an internal attack.
We can take two approaches to calculating probability for the final layer before the target: calculate probability for the path through each box or calculate the layer probability and use it in our attack path calculation. To keep things simple, we will use the latter approach.
Our probability calculation changes at this layer. Note that an attacker can decide to crack the operating system or the database instance, or both. When calculating AND conditions, we multiply. When calculating OR conditions, we add. So the probability of passing through the operating system/database layer is P(success) = P(0.6) + P(0.6), which we cap at P=1.0. Although the formula results in a sum of 1.2, a probability cannot exceed 100%, so the excess is irrelevant.
Using the final layer and previous calculations, the probability of reaching the target from a remote location, using the Web server, is P(success) = P(0.5) * P(1.0) = P(0.5). The probability of an internal attack that does not require gaining physical access (a rogue employee or a remotely controlled employee system) is P=1.0.
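The layer arithmetic above is easy to script once the estimates are written down. The following sketch reproduces the remote path through the Web server using the chapter's rules: multiply for AND, add and cap at 1.0 for OR. The function names are our own. The third function is an alternative OR rule that treats the two options as independent events; it is an assumption added for comparison and is not part of the method described in this chapter.

```python
from math import prod


def and_layer(probabilities):
    """All conditions must be met: multiply the estimates."""
    return prod(probabilities)


def or_layer(probabilities):
    """Any one condition suffices: add the estimates, capped at 1.0."""
    return min(1.0, sum(probabilities))


def or_layer_independent(probabilities):
    """Alternative OR rule (assumption): 1 minus the chance every option fails."""
    return 1.0 - prod(1.0 - p for p in probabilities)


# Estimates taken from the text for the remote attack via the Web server.
perimeter = and_layer([1.0, 0.5, 1.0])  # firewall AND Web server AND firewall
switch = 1.0                            # flat network, no access control lists
os_db = or_layer([0.6, 0.6])            # operating system OR database instance

remote_path = and_layer([perimeter, switch, os_db])
print(f"Remote path via Web server: P={remote_path:.2f}")
print(f"OS/database layer, independent-events rule: P={or_layer_independent([0.6, 0.6]):.2f}")
```

With the capped sum, the operating system/database layer contributes 1.0 and the path stays at 0.5. The independent-events rule gives a somewhat lower layer value, which is a reminder that these figures are working estimates rather than precise measurements.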
Component attack trees
It is also possible to use attack trees to calculate the probability that an attacker will overcome a specific box, or component, in our network attack tree. See Figure 2-9. This is a simple attack tree to determine the probability that an attacker can own the Web server. Using the probability calculations described above, we arrive at P=0.5. The formula is (P(1.0) * P(1.0) * P(0.5)) * P(1.0) = 0.5. This approach applies to any set of tasks an attacker must perform in order to achieve an objective.
Figure 2-9: Component Attack Tree
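A component attack tree can also be written as a small nested structure and evaluated recursively. The sketch below encodes the formula (P(1.0) * P(1.0) * P(0.5)) * P(1.0). The leaf names "Condition A" through "Condition C" are placeholders, since the figure's box labels are not reproduced here, and which lower-level box carries the 0.5 estimate is likewise an assumption.

```python
from dataclasses import dataclass, field
from math import prod
from typing import List


@dataclass
class Node:
    name: str
    gate: str = "LEAF"   # "AND", "OR", or "LEAF"
    p: float = 0.0       # estimate for a leaf; ignored for AND/OR nodes
    children: List["Node"] = field(default_factory=list)

    def probability(self) -> float:
        """Evaluate this node: multiply for AND, capped sum for OR."""
        if self.gate == "LEAF":
            return self.p
        child_ps = [child.probability() for child in self.children]
        if self.gate == "AND":
            return prod(child_ps)
        if self.gate == "OR":
            return min(1.0, sum(child_ps))
        raise ValueError(f"Unknown gate: {self.gate}")


# Placeholder leaf names; the estimates come from the formula in the text.
lower_layer = Node("Lower-level components", "AND", children=[
    Node("Condition A", p=1.0),
    Node("Condition B", p=1.0),
    Node("Condition C", p=0.5),
])
own_web_server = Node("Own the Web server", "AND", children=[
    lower_layer,
    Node("Act undetected", p=1.0),
])

print(f"P={own_web_server.probability():.2f}")
```

The same `Node` type can describe the network attack tree by nesting layers, so a component result can feed directly into the box it represents in the larger tree.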
The on/off approach
Another approach to attack trees is simply labeling each box as possible or impossible. See Figure 2-10. The on/off attack tree works on the same principle as Boolean AND and OR logic. To pass through an AND, all inputs must be on, or “1.” To pass through an OR, only one of the inputs must be on.
We know that in order to take over Erudio’s Web server, all three of the lower level components in Figure 2-10 must be possible. In other words, they have to be on. I created a tree for a Web server with no relevant vulnerabilities. This makes the input to the AND 0,1,1. Consequently, the input to Act Undetected is “0.” We also know from our previous probability example that each layer’s value is ANDed to the previous layer’s ANDed value. In this case, the attacker can act undetected, giving that layer a “1.” The incoming value from the lower layer is “0.” This results in AND 0,1 or a result of “0.” Because an attacker is unable to leverage an existing vulnerability, she will not accomplish her objective.
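The on/off evaluation just described maps directly onto Boolean logic. This short sketch uses the inputs from the example: 0,1,1 for the lower layer and 1 for acting undetected. The variable names are our own.

```python
def and_gate(inputs):
    """On only if every input is on."""
    return all(inputs)


def or_gate(inputs):
    """On if at least one input is on."""
    return any(inputs)


# Lower layer: no relevant vulnerability (0), the other two components possible (1, 1).
lower_layer = and_gate([False, True, True])

# The attacker can act undetected (1), but the incoming value from below is 0.
act_undetected = True
own_web_server = and_gate([lower_layer, act_undetected])

print(own_web_server)  # False
```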
This approach, also applicable to the higher-level network attack tree, has one major advantage over the probability approach; it is simpler. If the potential target is classified as confidential, it might not make sense to calculate probabilities along the attack path. However, the On/Off approach has one big weakness.
Figure 2-10: On/Off Component Attack Tree
The On/Off attack tree assumes all or nothing. In Figure 2-10, for example, the assessor must believe that a vulnerability will never exist. This requires a vendor that never releases buggy code or a patch management process that results in release-date application of bug fixes applied before zero-day attacks. What is the probability of both conditions being true? In my experience, there is always some probability that an attacker can crack a security layer. We should always plan to some degree for this inevitability.
Step 5: Impact Analysis
In Steps 1 through 4, we assessed current state processes, controls, and attack paths. Our final findings in Step 4 provide the information needed to calculate the impact of a successful attack. With that information, we can create a risk matrix for a final risk evaluation.
Security incidents have negative business impact, no matter how minor. Otherwise, why would we care? The objective in a risk assessment is determining the severity of impact. With this information, management can determine how to react to the calculated risk.
Many variables affect business impact, including (Olzak, 2011)
- Maximum tolerable downtime. What is the longest period over which a business can be without a process, or part of a process, before suffering irreparable damage?
- Impact on employees.
- Impact on investors.
- Impact on customers. Customers do not like mismanagement of the data with which they have entrusted your organization. How will they react if their data is stolen? Further, how long will they wait for recovery of critical processes before they go to the competition?
- Impact on current and future earnings potential.
- Sanctions due to non-compliance with regulatory requirements.
We can take one of two paths when calculating business impact: quantitative or qualitative.
Decision makers like quantitative assessments because the results include hard dollar numbers. This makes it easier to evaluate your cost justification. However, quantitative assessments are time-consuming, and most organizations do not have access to all the financial impact information necessary, including (Olzak, 2011(b))
- Product Cost. The purchase price of each technical component of the process.
- Professional services cost. The cost of engaging outside assistance to install, configure, and test each process component.
- Confidentiality. The cost of customer, investor, or government litigation or sanctions for breach of sensitive information.
- Integrity. The aggregate cost of loss of faith in the accuracy of the organization’s data.
- Availability. Losses related to idle employees or idle manufacturing processes.
When this information is available, a quantitative analysis calculates the annualized loss expectancy of a specific threat/vulnerability pair. This loss, ALE, is compared to the annualized cost of risk mitigation. If the ALE is less than the annual mitigation cost, the risk is typically accepted.
The following steps calculate ALE for a threat/vulnerability pair identified for a specific business process or technical component of a process.
- Calculate the aggregate value of relevant technology assets and the processes they support.
- Calculate the expected loss related to a single incident, the single loss expectancy (SLE). SLE is affected by the presence of a documented and practiced incident response plan.
- Calculate the annualized rate of occurrence (ARO). We do not usually expect an incident to occur each year. Rather, a specific event is likely to repeat on a multi-year cycle. For example, we give an incident occurring every year an ARO of 1.0. If, however, an incident is expected to occur every five years, the ARO is 0.20 (1.0 / 5).
- Multiply the SLE by the ARO to arrive at the ALE. For example, if the SLE of an incident is $100,000 and the ARO is 0.20, the ALE is $20,000. In other words, the ALE is the annual cost of a single incident when total cost is spread over five years, or the expected repetition cycle of the incident.
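The steps above reduce to a few lines of arithmetic. The following Python sketch is illustrative only: the function names and the $15,000 annual mitigation cost are assumptions introduced for the example, while the $100,000 SLE and the five-year cycle come from the text.

```python
def annualized_rate_of_occurrence(years_between_incidents: float) -> float:
    """ARO: expected number of incidents per year."""
    if years_between_incidents <= 0:
        raise ValueError("years_between_incidents must be positive")
    return 1.0 / years_between_incidents


def annualized_loss_expectancy(sle: float, aro: float) -> float:
    """ALE = SLE * ARO."""
    return sle * aro


def accept_risk(ale: float, annual_mitigation_cost: float) -> bool:
    """Risk is typically accepted when the ALE is less than the annual mitigation cost."""
    return ale < annual_mitigation_cost


sle = 100_000.0                           # single loss expectancy from the text
aro = annualized_rate_of_occurrence(5)    # one incident expected every five years
ale = annualized_loss_expectancy(sle, aro)
hypothetical_mitigation_cost = 15_000.0   # assumed value, for illustration only

print(f"ARO = {aro:.2f}")
print(f"ALE = ${ale:,.0f}")
print("Accept risk" if accept_risk(ale, hypothetical_mitigation_cost) else "Mitigate risk")
```

With these inputs the ALE is $20,000, which exceeds the assumed mitigation cost, so the comparison favors mitigation rather than acceptance.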
If real numbers are available, this is likely the most accurate method of assessing risk; it integrates probability of occurrence and controls into the process resulting in risk represented by actual dollars. When the organization does not have the numbers, or you do not have time to step through collecting/calculating them, a qualitative assessment is appropriate.
Instead of relying on dollar numbers, qualitative assessments rely on employee knowledge and experience. Knowledge usually consists of research and consultation with vendors, law enforcement, and other resources. A qualitative assessment translates knowledge and experience into a number, as demonstrated in Step 6. Whether you use dollars or other values, always try to estimate impact in terms of the variables and SLE calculation used in quantitative assessments. Understanding possible financial impact is important, whether or not you have actual numbers.
Step 6: Risk Determination
At the beginning of this chapter, I wrote that our Erudio example is based on a qualitative analysis. This means that our probability and control assessments must also translate into representative values. We do this by using a table like that shown in Table 2-4, a modified DREAD (Meier, Mackman, Dunner, Vasiereddy, Escamilla, & Murukan, 2003) table.
When we use the risk calculator in Figure 2-11, we can translate all our work from this and previous steps into an overall risk value. The calculator’s formulas break down as follows:
Risk = Probability of Occurrence * Business Impact
Risk = (Reproducibility + Exploitability) * (Damage Potential + Affected Users + Discoverability)
The numbers placed in each column are subjective. You might not agree with my interpretation of risk. That is not important. This is simply a mental exercise to help you organize and interpret your current state findings. We now fill our calculator with numbers assigned from our different steps.
- Threat = Remote cyber-criminal.
- Threat Agent/Action = Remote access to customer payment data using various tools and techniques.
- Reproducibility = 3
- Exploitability = 3
- Damage Potential = 3
- Affected Users = 3
- Discoverability = 3
Table 2-4: Modified DREAD Table
Figure 2-11: Risk Calculator
Entering our values in the calculator, we arrive at the result shown in Figure 2-12. We can interpret the final risk value using a table like Table 2-5.
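The calculator's two formulas can be sketched in Python as follows. This is a minimal illustration, not a reproduction of the spreadsheet in Figure 2-11: the class and method names are assumptions, and the 1 (low) to 3 (high) scoring scale is inferred from the values used in this chapter.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DreadScores:
    """Modified DREAD scores; a 1 (low) to 3 (high) scale is assumed."""
    reproducibility: int
    exploitability: int
    damage_potential: int
    affected_users: int
    discoverability: int

    def probability_of_occurrence(self) -> int:
        # Probability of Occurrence = Reproducibility + Exploitability
        return self.reproducibility + self.exploitability

    def business_impact(self) -> int:
        # Business Impact = Damage Potential + Affected Users + Discoverability
        return self.damage_potential + self.affected_users + self.discoverability

    def risk(self) -> int:
        # Risk = Probability of Occurrence * Business Impact
        return self.probability_of_occurrence() * self.business_impact()


# Current-state Erudio scores for the remote cyber-criminal threat
erudio_current = DreadScores(
    reproducibility=3,
    exploitability=3,
    damage_potential=3,
    affected_users=3,
    discoverability=3,
)

print("Probability of occurrence:", erudio_current.probability_of_occurrence())
print("Business impact:", erudio_current.business_impact())
print("Risk:", erudio_current.risk())
```

Applying the formulas to the scores listed above gives (3 + 3) * (3 + 3 + 3) = 54.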
So far, this chapter provides methods to collect current state risk information and various ways to view and assess the results. It is not important if you follow these or your own methods. What is important, however, is that you do assess and intelligently interpret your findings.
Figure 2-12: Calculated Erudio Risk
Table 2-5: Risk Score Interpretation
Step 7: Controls Recommendations
Based on our risk calculation, we must begin immediately to plan for remediation. However, we are in the middle of a budget year. How can we fix this without cash? Well, in the situation Erudio finds itself in, there are usually controls to implement that cost nothing but employee-hours. These are our first steps in risk mitigation. However, they are also part of an overall mitigation plan, incrementally implemented over two or more budget cycles.
Our first objective in Step 7 is to define a future state, as shown in Figure 2-13. Note the addition of intrusion prevention devices, network segmentation, and implementation of outward-facing security zones. Expecting approval for the entire budget in a single year is probably not reasonable; other projects are also on the table. However, we can break it down as follows:
- Now
  - Block all traffic passing through the firewalls, then open only what is absolutely necessary and approved; ensure TCP port 1433 is blocked
  - Create data center VLANs restricted by access control lists
  - Bring patch levels up to date and require aggressive oversight of patch management
  - Disable all unused network jacks in conference rooms, cubicles, and offices
  - Modify privileged account password management processes to ensure password policy compliance
  - Change the data center lock combination
- First year’s budget
  - Create outward-facing security zones with no communication between them
- Second year’s budget
  - Implement a security information and event management solution to collect and manage logs
- Third year’s budget
  - Implement intrusion prevention systems
  - Evaluate data-at-rest encryption
Figure 2-13: Future State Network Diagram
If you are not familiar with these concepts, do not worry. We cover them in later chapters.
So, how do our proposed immediate remediation steps affect risk? We start with a revised attack tree. See Figure 2-14. Our estimated probabilities after applying the controls listed under “Now” drop significantly. Using our probability calculations from Step 4, we calculate the success probability of both the internal and external attack paths.
Figure 2-14: Revised Attack Tree
External attack path
The probability of cracking through the perimeter defenses is
P(Internal Firewall) * P(Web server) * P(External Firewall) = P(Perimeter) = 0.0050
The probability of cracking through the perimeter and the switch access control lists is
P(Perimeter) * P(Switch Access) = P(Server Access) = .0005
The probability of cracking either the operating system or database to gain ownership is
( P(OS Config.) + P(DB Config.) ) * P(Server Access) = P(Success) = .0001
The probability of a successful attack along this path is
P(Success) * 100 = .01%
Internal attack path
The probability of an internal attacker using an open port is
P(Open Port) * P(Physical Access) = P(Physical Access) = 0.25
The probability of gaining internal physical access and cracking the switch access control lists is
P(Physical Access) * P(Switch Access) = P(Server Access) = .025
The probability of cracking the operating system or database to gain data ownership is
( P(OS Config) + P(DB Config) ) * P(Server Access) = P(Success) = .005
The probability of a successful attack along the path assessed is
P(Success) * 100 = .5%
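Both path calculations follow the same pattern, which the Python sketch below makes explicit. The function name is hypothetical. The entry probabilities (0.0050 and 0.25) come from the text; the switch access value (0.1) and the combined OS/database value (0.2) are not stated in this section and are back-derived here from the intermediate results, so treat them as assumptions. Summing the OS and database probabilities follows the formulas above and is only a reasonable approximation when both values are small.

```python
def path_success(p_entry: float, p_switch_access: float, p_os_or_db: float) -> float:
    """Probability of a successful attack along one path.

    p_entry          -- probability of getting past the first barrier
                        (perimeter for external, physical access for internal)
    p_switch_access  -- probability of defeating the switch access control lists
    p_os_or_db       -- P(OS Config) + P(DB Config), as summed in the text
    """
    p_server_access = p_entry * p_switch_access
    return p_os_or_db * p_server_access


# Assumed values, implied by the intermediate results in the text
P_SWITCH_ACCESS = 0.1   # 0.0005 / 0.0050 and 0.025 / 0.25
P_OS_OR_DB = 0.2        # 0.0001 / 0.0005 and 0.005 / 0.025

external = path_success(0.0050, P_SWITCH_ACCESS, P_OS_OR_DB)  # P(Perimeter) from the text
internal = path_success(0.25, P_SWITCH_ACCESS, P_OS_OR_DB)    # P(Physical Access) from the text

print(f"External path: {external:.4f} ({external * 100:.2f}%)")
print(f"Internal path: {internal:.4f} ({internal * 100:.2f}%)")
```

Keeping the calculation in one function makes it easy to rerun both paths whenever a control changes one of the input probabilities.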
Risk calculator – Remote attack path
The modified risk calculation for the remote attack path is shown in Figure 2-15. Using Table 2-4, we score Reproducibility and Exploitability low. The overall risk score drops to 18: the highest value considered low risk. When Erudio implements log management in year 2, Discoverability drops to 1 and Risk to 14.
Figure 2-15: Controls Adjusted Risk Calculator
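The adjusted scores can be checked with a short Python sketch of the risk formula from Step 6. The function name and scenario labels are hypothetical. The sketch assumes that a low score is 1 and that Damage Potential and Affected Users remain at 3, which is consistent with the results of 18 and 14 reported above.

```python
def risk_score(reproducibility: int, exploitability: int,
               damage_potential: int, affected_users: int,
               discoverability: int) -> int:
    """Risk = (Reproducibility + Exploitability) *
              (Damage Potential + Affected Users + Discoverability)."""
    probability = reproducibility + exploitability
    impact = damage_potential + affected_users + discoverability
    return probability * impact


# Assumed scores: "low" is taken to be 1; impact scores are otherwise unchanged
scenarios = {
    "Immediate controls applied": dict(
        reproducibility=1, exploitability=1,
        damage_potential=3, affected_users=3, discoverability=3),
    "Year 2, log management added": dict(
        reproducibility=1, exploitability=1,
        damage_potential=3, affected_users=3, discoverability=1),
}

for name, scores in scenarios.items():
    print(f"{name}: risk = {risk_score(**scores)}")
```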
It did not take a lot of money to significantly reduce risk; it took only properly configuring and managing existing controls. One caveat, however: the results of this step are not foolproof. They depend on our understanding of existing threats, exploitable vulnerabilities, and our security posture remaining static. None of these things is certain. Calculating risk is an ongoing process requiring an open mind and a willingness to adjust your perceptions of residual risk frequently. And these are just two attack paths of many.
Step 8: Action Plan and Proposal Creation and Presentation
We completed our assessment. Now comes the hard part: we must convince management to accept our risk assessment, approve our future state, and give us approval to begin incremental remediation. However, they have other options. They can:
- decide to do nothing and simply ignore the risk
- accept the risk
- mitigate the risk (our recommendation)
- transfer the risk using insurance, outsourcing, etc.
It is our job to present our findings so that the decision makers understand the risk, especially business impact. Figure 2-16 shows a tool that might help. Managers will probably realize this is not much different from managing financial risk.
Figure 2-16: Risk Decision Flow (Stoneburner, Goguen, & Feringa, 2002, p. 28)
Your presentation should include:
- The decision you are asking the managers to make
- The background on what you did and why
- The results of your assessment, focusing on business impact
- Your recommendations for mitigating risk and a cost/benefit analysis (include an action plan with expected completion dates, required resources, etc.)
Step 9: Implement Controls
Once you have management approval to proceed, standard project and change management processes take over for the immediate control changes. Unfortunately, changes in year two and after require additional presentations and justification. As business needs change, so do management’s perceptions of the importance of using their finite pool of dollars for something that does not easily relate to productivity enhancement. Further, you should conduct this assessment again based on a planned assessment cycle. Keep your documentation updated and handy.
Step 10: Measure and Adjust
In a world where everything works as expected, this step is not necessary. However, we have not found that place yet. Consequently, we must monitor outcomes we expect from the changes we make.
A good method is paying a third party to conduct annual penetration tests. Test results provide information about the accuracy of your probability values. They also help you improve your ability to assess risk given a set of conditions.
Never assume nothing has changed or that your assessment was 100 percent accurate. Watch, analyze, and modify your outcomes. Again, manage your outcomes, not your controls. Adjust your controls until your outcomes (e.g., reduced attack path probabilities) meet your expectations and those of management.
Security management is risk management. It is a balance between preventing bad things from happening to your organization and supporting its business objectives. Most business managers still do not understand that managing security risk helps them achieve their objectives. We have much work to do.
The foundation of risk management is the risk assessment. We begin with an understanding of the business and the supporting information assets. Identifying threat/vulnerability pairs and the paths they potentially follow is next, including an analysis of existing controls. Finally, we calculate risk.
An attack scenario with high risk of success requires immediate remediation. Identification of future state allows us to build an incremental implementation project plan and proposal for management. Once implemented, controls require frequent measurement and monitoring of outcomes.
Manadhata, P. K., Karabulut, Y., & Wing, J. M. (n.d.). Report: Measuring the attack surfaces of enterprise software. Retrieved December 29, 2011, from Carnegie Mellon: School of Computer Science: http://www.cs.cmu.edu/~wing/publications/ManadhataKarabulutWing08.pdf
Meier, J. D., Mackman, A., Dunner, M., Vasiereddy, S., Escamilla, R., & Murukan, A. (2003, June). Chapter 3: Threat Modeling. Retrieved December 16, 2011, from Microsoft MSDN: http://msdn.microsoft.com/en-us/library/Aa302419
Olzak, T. (2008, February). A Practical Approach to Managing Information System Risk. Retrieved November 9, 2011, from Tom Olzak on Security: http://adventuresinsecurity.com/Papers/Practical_Approach_IS_Risk_p.pdf
Olzak, T. (2011(b), June). Manage the Enterprise Attack Surface. Retrieved December 29, 2011, from CBS Interactive: http://www.techrepublic.com/downloads/manage-the-enterprise-attack-surface/2949257
Olzak, T. (2011, December). Risk Management. CBS Interactive/TechRepublic.
Stoneburner, G., Goguen, A., & Feringa, A. (2002, July). Risk Management Guide for Information Technology Systems, NIST Special Publication 800-30. Retrieved November 9, 2011, from http://csrc.nist.gov/publications/nistpubs/800-30/sp800-30.pdf