Guidelines for developing and maintaining a security operations center (SOC)
This article explains the strategy, planning, and execution involved in building and maintaining a SOC. Building a SOC is complex and involves many moving parts, so I have tried to cover the basic points that a Team Leader or SOC Manager needs to understand in order to start the process and implement it properly.
In this article, I will cover the SOC roadmap, planning and building strategies, developing Use Cases, team building, the SOP (Standard Operating Procedure), the SOW (Scope of Work), and the responsibilities of individual SOC members.
I will cover SIEM-related aspects such as the POC checklist and architecture in upcoming articles. If there is a specific topic you would like explained, please leave a comment.
Q. What is a SOC?
- SOC stands for Security Operations Center.
- It is a command center staffed by highly qualified ethical hackers and security analysts whose primary aim is to monitor the SIEM console continuously, detect security incidents, and then report, escalate, and close them with proper justification and root cause.
The SOC team works in shifts around the clock (24/7) to detect intrusive, malicious, and suspicious events as well as misconfigurations and policy violations.
To build a SOC team, the SOC Manager has to be clear about the SOC roadmap, which consists of three factors: people, process, and technology.
Once the SOC Manager has identified what is required to secure the organization against attacks, he can map those requirements to the three roadmap factors. We will look at each of them in detail.
Before we start, let's look at why every corporate organization needs a SOC.
Q. Why is a SOC needed?
The Security Operations Center is a core part of information technology and information security in every corporate sector.
Nowadays we all know how cyber attackers affect the financial growth of corporate environments, especially financial institutions.
The impact ranges from leaking customers' financial and sensitive data to disrupting the daily functioning of the IT infrastructure with attacks such as DDoS.
For malicious hackers nothing is impossible: every day they invent new exploits and bypass techniques and spread them over the internet. The SOC team makes their work harder and tries to keep the corporate infrastructure secure.
Put simply, a SOC exists to avoid situations that involve financial loss and damage to reputation. To avoid data breaches and high-criticality attacks, the organization needs to focus on incident detection, investigation, and proper closure capabilities.
Let's start with a detailed look at the roadmap.
A roadmap is the right approach to building a SOC team.
It is needed because, however much you invest, you cannot build a perfect and secure SOC that runs 24/7 all year in one step. What to do next, and how to do it, are always open questions.
Think of the roadmap as a plan for moving ahead incrementally and completing milestones that each improve the security of the organization.
The roadmap follows the triad of security operations: people, process, and technology.
- People (Team members)
People, that is, the team members with their specialized skills, are the main link in security operations.
Responsibilities, i.e. SOPs, are assigned according to skill level.
The SOC Manager should plan for the following skill levels when building a SOC:
- L1: Monitoring Analyst.
- L2: Sr. Analyst.
- SIRT Analyst: Experienced in Analysis and investigation.
- Process Consultant (Optional).
- SOC Manager (Team Leader).
The SOC Manager always recruits team members according to the job description and scope of work. The skills, duties, and other qualities required within the team are summarized below:
L1 Analyst: (Per Shift at least 2)
An L1 Analyst should have ethical hacking knowledge. CEH is the entry-level certification for the SOC; CCNA, MCSA, and other certifications are an add-on. A candidate with strong knowledge of hacking and analysis should be hired.
The L1 Analyst who monitors the SIEM console is fully responsible for identifying and reporting security incidents. The scope of L1 is to monitor the SIEM console continuously (24/7) and report triggered incidents to the asset owner and the L2 Analyst so that they can be brought to closure.
The SOC Manager has to arrange training on SIEM administration, integration, and monitoring for L1 as well as for all other team members, so that they become proficient and do not face problems while handling an issue.
L2 Analyst: (Per shift at least 1 or 2)
The L2 Analyst is more skilled and more experienced in monitoring and analysis than L1, and the pay scale is correspondingly higher.
The main scope of work is to respond to and close the security incidents raised daily by L1 within the defined SLA (Service Level Agreement). This includes daily follow-up of open cases with the owners, discussion over calls, and resolution of the cases.
Incidents that have a high priority, or that the L2 Analyst thinks need deeper investigation, are handed over to the SIRT Analyst for immediate response and closure.
L2 also enhances correlation rules and simulates attacks to verify that a rule works.
SIEM administration and coordination with support are also part of the scope.
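The hand-off between L2 and SIRT and the SLA check described above can be sketched in a few lines of Python. This is only an illustration: the SLA hours per priority, the field names, and the `assign_owner` helper are my own assumptions, and the real values must come from the SLA agreed with the organization or customer.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

# Hypothetical SLA targets in hours per priority (assumed values, not from any standard).
SLA_HOURS = {"low": 72, "medium": 24, "high": 4, "critical": 1}


@dataclass
class Incident:
    ticket_id: str
    priority: str                      # "low", "medium", "high" or "critical"
    opened_at: datetime
    needs_deep_investigation: bool = False


def assign_owner(incident: Incident) -> str:
    """Return the role that should own an incident after L1 has raised it."""
    # High-priority cases, or cases L2 flags for deeper investigation, go to SIRT.
    if incident.priority in ("high", "critical") or incident.needs_deep_investigation:
        return "SIRT"
    return "L2"


def sla_deadline(incident: Incident) -> datetime:
    """Time by which the incident must be closed under the assumed SLA."""
    return incident.opened_at + timedelta(hours=SLA_HOURS[incident.priority])


def is_sla_breached(incident: Incident, now: datetime) -> bool:
    """True if the incident is still open past its SLA deadline."""
    return now > sla_deadline(incident)
```

A daily follow-up report can then list every open ticket for which `is_sla_breached` is true, grouped by the owner returned from `assign_owner`.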
SIRT Analyst: (Per shift at least 1)
The SIRT Analyst is more skilled than L1 and equivalent to or more skilled than L2. He should be capable of investigating compromise events, threat-related events, and internal data leakage, identifying the internal resource who performed a malicious activity, and much more. Network forensics and deep packet inspection are additional required skills.
The pay scale is higher than for L1 and L2.
The SIRT Analyst is responsible for handling high-priority cases escalated by L2, the internal team, higher management, or external resources.
Once a security incident is assigned to the SIRT Analyst, he has to start a deep investigation. He is also responsible for the proper closure of the incident with an RCA (Root Cause Analysis). After closure he should prepare a detailed incident closure report and submit it to the team and the SOC Manager, who presents it to senior management (CISO) in the daily or weekly meeting.
Overall, the SIRT Analyst is responsible for the proper response to and closure of High and Critical security incidents within SLA.
SOC Manager: (General Shift and offline support to team).
The SOC Manager should have substantial experience on both the technical and the managerial side.
Stronger technical skills are an added advantage for him as well as for the team.
He has the most overall experience and is responsible for appreciation, handling of violations, and escalations within the SOC team. He is the single point of contact between the team and senior management (the CISO and other department heads).
He should be supportive and interactive.
He should understand each teammate and that teammate's contribution.
- Overall, the Manager has the most responsibilities, covering both the management of the team and reporting to senior management.
- SOC process definition (applicable where a process consultant is not part of the team).
- Preparation and validation of the daily, weekly, and monthly reports that showcase the SOC team's contribution to the security of the organization.
- Handling multiple projects, understanding the requirements of new technologies and their implementation, and transferring knowledge within the team.
- If the Manager is handling MSSP services, he has to understand the actual requirements of the client and work accordingly. Keeping the customer happy is the main challenge for the SOC Manager and the team.
- BCP (Business Continuity Planning) and DR (Disaster Recovery) setup, maintenance, and proper execution.
- Immediate support to the team, and escalation if no proper response is received from a vendor, asset owner, etc.
- Arranging training and seminars so that the team gains expertise and skills.
- Preparing and publishing the monthly roster so that the defined number of analysts is present in every shift.
- Personal discussion and approval of the yearly appraisal for every team member.
- Arranging an outing for the team once a year.
- Many more responsibilities.
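The roster responsibility in the list above lends itself to a quick automated check. The sketch below uses the per-shift minimums given earlier in this article (at least 2 L1, 1 L2, and 1 SIRT analyst per shift); the roster layout and the `roster_gaps` function name are assumptions made for illustration.

```python
from collections import Counter

# Minimum analysts per shift, taken from the staffing notes in this article.
MIN_PER_SHIFT = {"L1": 2, "L2": 1, "SIRT": 1}


def roster_gaps(roster):
    """Find understaffed shifts.

    roster maps a shift name to a list of (analyst_name, role) tuples.
    Returns {shift: {role: missing_count}} for shifts below the minimum.
    """
    gaps = {}
    for shift, people in roster.items():
        counts = Counter(role for _, role in people)
        missing = {
            role: needed - counts[role]
            for role, needed in MIN_PER_SHIFT.items()
            if counts[role] < needed
        }
        if missing:
            gaps[shift] = missing
    return gaps
```

Running this on the draft roster before publishing it shows which shifts still need an analyst and at which level.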
Important note to the SOC Manager:
To build a good and stable team, he has to recruit skilled engineers at a good pay scale, get to know each member personally, and solve their problems (if any).
Process is the main part of daily SOC operation.
Defining every process, from incident management through investigation, standardizes the actions a SOC Analyst will take and ensures that no task is missed.
From L1 Analyst → L2 Analyst → SIRT Analyst → SOC Manager, the SOW (Scope of Work) and SOP (Standard Operating Procedure) of each role need to be defined very well as part of the process.
The number of processes and procedures in a SOC is determined by the company's or customer's policies and requirements, the services offered, the scope of work, and the technologies used.
A very well organized SOC has a large number of policies and processes in place. As a minimum baseline, the following processes can serve as a reference (examples):
- SIEM monitoring and Notification (email, mobile, chat, etc.) procedure.
- Event management process.
- Security incident ticket management process (how to use ticketing systems, e.g. HPSM).
- Incident Handling, Reporting and Escalation process.
- Daily activities process like checklist and handover.
- Daily, weekly and monthly report format to Management
- Compliance monitoring process.
- Incident analysis and investigation response process.
- New technology operating process.
All policies and processes should be properly documented and kept on a wiki portal, SharePoint, or a shared drive for team reference.
Core technologies such as SIEM, which combine several functions like data (raw log/packet) collection, aggregation, normalization, detection, and analytics, are the secret of an effective SOC.
Security devices (AV, HIPS, NIPS, DLP, WAF, Arbor, FireEye, DAM, etc.) are numerous, and monitoring each tool's console individually for intrusion detection is not practical and increases the manpower required.
SIEM comes into the picture to overcome these problems.
SIEM (Security Information and Event Management) is the one technology that SOC monitoring is built around. It collects raw logs from multiple log sources such as desktops, laptops, mobiles, servers, and network devices (infra + telco) as well as from security devices, converts them into logical security events, and populates them on the SIEM console, where security incidents are then raised.
The SIEM triggers events based on the defined correlation rules.
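To make the collect, normalize, and correlate flow concrete, here is a minimal Python sketch. It is not how any particular SIEM product works internally: the two raw log layouts, the field names, and the rule names are invented for this example, and a real SIEM ships parsers for each device type.

```python
import re
from datetime import datetime

# Hypothetical raw log layouts, invented for this sketch.
FIREWALL_RE = re.compile(
    r"(?P<ts>\S+) (?P<host>\S+) (?P<action>ALLOW|DENY) "
    r"src=(?P<src>\S+) dst=(?P<dst>\S+) dport=(?P<dport>\d+)"
)
AV_RE = re.compile(r"(?P<ts>\S+) (?P<host>\S+) threat=(?P<threat>\S+) file=(?P<path>\S+)")


def normalize(raw_line):
    """Convert one raw log line into a common event dict, or None if unknown."""
    m = FIREWALL_RE.fullmatch(raw_line)
    if m:
        return {
            "time": datetime.fromisoformat(m["ts"]),
            "device": m["host"],
            "category": "firewall",
            "action": m["action"].lower(),
            "src_ip": m["src"],
            "dst_ip": m["dst"],
            "dst_port": int(m["dport"]),
        }
    m = AV_RE.fullmatch(raw_line)
    if m:
        return {
            "time": datetime.fromisoformat(m["ts"]),
            "device": m["host"],
            "category": "antivirus",
            "threat": m["threat"],
            "file": m["path"],
        }
    return None


# Each rule is a name plus a predicate over a normalized event (assumed rules).
RULES = {
    "Security Risk Found (AV)": lambda e: e["category"] == "antivirus",
    "Denied firewall connection": lambda e: e["category"] == "firewall"
    and e["action"] == "deny",
}


def correlate(raw_lines, rules=RULES):
    """Normalize raw lines and return (rule_name, event) for every rule hit."""
    triggered = []
    for line in raw_lines:
        event = normalize(line.strip())
        if event is None:
            continue  # unparsed lines would go to a review queue in practice
        for name, matches in rules.items():
            if matches(event):
                triggered.append((name, event))
    return triggered
```

The point of the common event format is that a rule is written once against fields such as `src_ip` and `category`, regardless of which device produced the log.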
For reference, here is an example SIEM architecture:
For more info, refer link: https://community.mcafee.com/docs/DOC-6207
Intel Nitro SIEM Architecture
ESM: Enterprise Security Manager. The brain of the SIEM solution and its web interface.
ACE: Advanced Correlation Engine
DEM: Database Event Monitor
ADM: Application Data Monitor
Receiver: Event Receiver. The intermediary between the log sources and the ESM, used to collect third-party raw logs.
The Receiver supports pull, push, and agent-based methods for collecting raw logs.
ELM: Enterprise Log Manager
CLMS: Centralized Log Management Solution (required at ISP level, where huge volumes of logs from hundreds of thousands of devices must be stored)
Storage: where the actual raw events are stored. It may be DAS, NAS, CIFS, SAN, or iSCSI, as per the storage requirement.
The manager should recruit team members who have the following areas of knowledge and expertise.
Duties and skillset of SOC member
- SIEM Implementation, Administration, Integration.
- SIEM Proactive/Reactive Monitoring and incident handling.
- Security event management.
- Threat Management, Malware analysis (reverse engineering).
- Vulnerability management
- Network as well as Web Pen-test.
- Internal and External security devices management
- Managed customer services
The SOP changes according to industry standards and compliance requirements, with small changes to the SOW. The SOC team always has several other responsibilities that it must carry out without fail, such as:
Monitoring of real-time DDoS alerts in the console (Arbor DDoS tool): Experts suggest monitoring this console individually to get better clarity and more detail than SIEM logs provide.
IP abuse and copyright infringement: This is a crucial process that applies when a public IP registered to the company performs malicious or unethical activity over the internet.
- The Cyber Cell sends mail requesting details in order to proceed with resolving criminal cases.
- A third-party vendor reports that the source IP is performing malicious activity on a client's infrastructure; multiple recurrences will get the IP blacklisted.
Proxy monitoring: The web admin collects the proxy logs and creates a portal. The analyst monitors the logs and reports users who violate the defined InfoSec policies, for example by visiting restricted sites, accessing proxy sites, downloading copyrighted material, or watching videos.
Anti-Trojan activity: Financial corporations (banks) follow this process. When a Trojan collects bank login credentials from a client laptop or desktop, a third-party vendor captures the Trojan, dumps the credentials, and reports them to the subscribed financial organization. The SOC analyst then sends a reporting mail to the respective departments.
Brand abuse: Bad actors always find ways to collect people's personal information (PI), such as recharge or discount offers, fake apps, fake links, and phishing pages, and earn money by selling it to marketing companies. This damages the company's reputation and may cost it genuine customers. The SOC reports such cases to the Legal Department and gets the abusive content taken down as a priority.
Study the company environment/client side
Understanding the environment is the first phase of SOC setup. Complete visibility of the network architecture and assets is required to understand an incident scenario and to know which SME (subject matter expert) to escalate the issue to.
This visibility helps in meeting the defined SLA and the overall customer or in-house service commitments of the SOC.
The SIEM architect builds the SIEM setup after gaining a good understanding of the network architecture and network flow.
Read-only access to the individual security devices is required to continue the operation process. In the absence of a SIEM, monitoring each device individually is the fallback for the SOC.
- AV Monitoring access.
- Network and Host IDS/IPS monitoring access
- DLP console monitoring access (Symantec DLP)
- Centralized log storage access (CLMS, Syslog server)
- Email and SPAM gateway Access.
- Web Gateway (TMG / Bluecoat)
- Threat Management tools access (FireEye / Invincea)
- Firewall (Fortigate / Cisco)
- Vulnerability assessment tool (Nessus, Metasploit)
Use Cases play an important role in the effective functioning of a SOC. A Use Case is a scenario-based attack simulation and its detection in the SIEM console. The main scope of a Use Case is to create a correlation rule based on the events the simulation triggers.
For each hacking scenario, the SME creates the rule logic based on the logs triggered on the security devices.
E.g.: SQL injection attack detection.
Kali Linux is used to perform the attack. The analyst can refer to the firewall, NIPS, WAF, and web server logs to get visibility into the attack pattern. Based on the triggered events, he creates the correlation rule and confirms its effectiveness.
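The rule logic for the SQL injection example can be sketched as follows. This is a simplified illustration, not a production rule: the regular expression covers only a handful of well-known payload fragments, and the event fields (`time`, `src_ip`, `uri`) and the 60-second window are assumptions. The idea is to raise the incident only when a suspicious web server request and a WAF alert come from the same source IP close together in time.

```python
import re
from datetime import datetime, timedelta
from urllib.parse import unquote_plus

# A few common SQL injection fragments; illustrative only, far from exhaustive.
SQLI_RE = re.compile(
    r"union\s+select|\bor\s+1\s*=\s*1|information_schema|sleep\s*\(",
    re.IGNORECASE,
)


def looks_like_sqli(request_uri):
    """True if the URL-decoded request URI contains a known SQLi fragment."""
    return bool(SQLI_RE.search(unquote_plus(request_uri)))


def correlate_sqli(web_events, waf_events, window=timedelta(seconds=60)):
    """Return source IPs that sent an SQLi-looking request to the web server
    and also triggered a WAF alert within the given time window."""
    suspects = set()
    for web in web_events:
        if not looks_like_sqli(web["uri"]):
            continue
        for alert in waf_events:
            same_source = alert["src_ip"] == web["src_ip"]
            close_in_time = abs(alert["time"] - web["time"]) <= window
            if same_source and close_in_time:
                suspects.add(web["src_ip"])
    return suspects
```

After replaying the attack from the test machine, the analyst checks that the attacking IP appears in the result and that normal traffic does not, then tunes the pattern list and the window accordingly.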
Examples of Use Cases useful while setting up the SOC:
- Security Risk Found (AV).
- Suspicious outbound communication observed towards GTI IP (Firewall; on known as well as unknown ports).
- Excessive incoming connection observed towards web server (TCP/ UDP/ ICMP): NIPS
- DDoS attempt- Firewall/Arbor etc.
- Malicious binary execution observed (from HIPS): Netcat, Mimikatz, exploits (Dirty COW) for privilege escalation, etc.
- Multiple logins from different locations.
- Logs deleted from the source.
- User created and deleted within 1 hour (AD).
- Successful login after multiple failures observed (successful brute-force attack).
- Huge count of teardown connection observed (Firewall).
- Network/Port Scanning observed (Firewall).
There are many more rules that can be tested and deployed on the SIEM for the analyst to monitor.
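As one worked example from the list above, the "successful login after multiple failures" Use Case can be written as a small sliding-window check. The threshold of 5 failures, the 10-minute window, and the event fields are assumptions for this sketch; each SOC tunes them to its own environment.

```python
from collections import defaultdict, deque
from datetime import datetime, timedelta


def detect_bruteforce_success(events, min_failures=5, window=timedelta(minutes=10)):
    """Flag a successful login that follows repeated failures.

    events: login events sorted by time, each a dict with the keys
    "time", "user", "src_ip" and "outcome" ("failure" or "success").
    Returns one alert dict per suspicious successful login.
    """
    recent_failures = defaultdict(deque)  # (user, src_ip) -> failure timestamps
    alerts = []
    for event in events:
        key = (event["user"], event["src_ip"])
        failures = recent_failures[key]
        # Forget failures that fall outside the sliding window.
        while failures and event["time"] - failures[0] > window:
            failures.popleft()
        if event["outcome"] == "failure":
            failures.append(event["time"])
        elif event["outcome"] == "success":
            if len(failures) >= min_failures:
                alerts.append({
                    "user": event["user"],
                    "src_ip": event["src_ip"],
                    "time": event["time"],
                    "failures": len(failures),
                })
            failures.clear()  # a success resets the counter for this pair
    return alerts
```

The same pattern, a per-key window of timestamps plus a threshold, also fits Use Cases such as "user created and deleted within 1 hour" with a different key and condition.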
Developing a SOC is challenging because there are so many milestones that the SOC Manager has to note down and implement properly. After that, maintaining the SOC is an ongoing process: daily improvements and changes, and policy and process definition, are continuous.
SOC setup and implementation is a challenge, but it is not hard if the SOC Manager is on the right track and executes the defined milestones at the right time with the help of the team.
Note: It is not possible to cover all aspects in one article. Feel free to comment if you have any queries.