Real-Time Data Enrichment
Imagine an Advanced Security Analytics platform that ingests billions of signals from internal and external sources: raw network traffic, raw packet capture, endpoint data, VPN, proxy and firewall logs, SIEM data, structured data, and log files. It analyzes all of this data in real time and adds context by enriching it with dimensions such as geo-IP location and coordinates, the owner of the IP block, a comparison of the IP against threat-intelligence feeds to see whether it is malicious, and the reputation of that IP. Sound like a leap of faith? Not really.
With the advancement of AI and ML technologies, this is very much a reality, and it is a way to counter today’s complex cyber challenges. Known and unknown threats are identified from the input data through behavioral analysis, and each is assigned a Risk Score.
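As a rough illustration of enrichment at ingestion time, the sketch below attaches geo, ownership, and threat-intel context to a single event. The lookup tables, field names, and the `enrich` function are hypothetical stand-ins; a real deployment would consult a geo-IP database, an IP ownership source, and live threat-intel feeds.

```python
import ipaddress

# Hypothetical lookup tables (documentation IP ranges, made-up values).
GEO_BY_NETWORK = {
    ipaddress.ip_network("203.0.113.0/24"): {
        "country": "ZZ", "lat": 0.0, "lon": 0.0, "owner": "Example ISP",
    },
}
THREAT_INTEL = {"203.0.113.7": {"malicious": True, "reputation": 5}}


def enrich(event: dict) -> dict:
    """Return a copy of the event with geo, ownership, and threat-intel context."""
    ip = ipaddress.ip_address(event["dst_ip"])
    enriched = dict(event)  # never mutate the raw event
    for network, geo in GEO_BY_NETWORK.items():
        if ip in network:
            enriched["geo"] = geo
            break
    # Unknown IPs get an explicit "not known to be malicious" marker.
    enriched["threat_intel"] = THREAT_INTEL.get(
        event["dst_ip"], {"malicious": False, "reputation": None}
    )
    return enriched
```

Because the context is attached when the event arrives, a later query reads it directly instead of performing the lookups at investigation time.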
Fidelity
All this data ingestion is of zero value unless one can ensure that no data is lost and that integrity is maintained for proper exploration. In other words, fidelity is a must! Just as important is presenting the data cohesively, in one place on the screen, so that the security analyst can investigate whatever problem is at hand and follow internal process.
Context is King!
If one doesn’t have context at the time the event is happening, or when examining an incident, gaps remain in the story. Precious time and effort are lost. That is one more reason why enterprise customers continue to get hacked!
From an implementation perspective, software platform solutions (as opposed to appliance-based ones) can be installed on a bare-metal machine or in a virtualized environment, whether on-premises, in the cloud, or in a hybrid setup.
The product should not summarize the data. For example, 50 DNS requests should be stored as 50 individual requests, not aggregated into a single record.
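The difference between storing and summarizing can be shown in a few lines. This is only a sketch with made-up records and field names: it keeps every request as its own record and shows that a count can always be derived from the raw events, while the per-request timestamps cannot be recovered from a count.

```python
from collections import Counter

# 50 hypothetical DNS requests from one host, each with its own timestamp.
requests = [
    {"ts": 1_700_000_000 + i, "src_ip": "10.0.0.5", "query": "example.com"}
    for i in range(50)
]

# Full-fidelity store: one record per request, nothing collapsed.
event_store = list(requests)

# What a summarizing store would keep: a single (host, query) -> count entry.
summary = Counter((r["src_ip"], r["query"]) for r in requests)

# The summary can be rebuilt from raw events, but not the other way around.
derived = Counter((e["src_ip"], e["query"]) for e in event_store)
```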
Advanced Security Analytics Processing
Enriched data is ingested into a real-time stream-processing engine that is distributed across many machines. A “known threats” rules engine produces signals that the SOC team can investigate. In addition, SOC analysts can build their own rules, and the results are processed and stored in real time (in Elasticsearch and Hadoop), where they can be queried for any anomaly or inconsistency. A subset of the data (features extracted by ML jobs) is stored in Hadoop, where ML jobs run to discover unknown threats.
The ML jobs analyze the baseline behavior of assets on the network. Any deviation from the baseline is flagged, associated with a Risk Score, and exposed to the SOC analyst on the dashboard. Secondary analysis describes the anomaly, the events that led to it, and why it is anomalous, and then classifies it to reduce false positives. Feedback is collected in a Case database to reduce false positives further. Alerts go to the dashboard, and the investigation starts.
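To make the idea of flagging deviations from a baseline concrete, here is a minimal sketch that scores a new observation by how many standard deviations it sits from an asset's history. The z-score approach, the scaling factor, the threshold, and the function names are all assumptions chosen for illustration; the source does not say which behavioral model is used.

```python
from statistics import mean, stdev


def risk_score(baseline: list[float], observed: float, cap: float = 100.0) -> float:
    """Map deviation from an asset's baseline to a 0-100 score (illustrative scaling)."""
    mu = mean(baseline)
    sigma = stdev(baseline)  # needs at least two baseline points
    if sigma == 0:
        return 0.0 if observed == mu else cap
    z = abs(observed - mu) / sigma
    # Assumed scaling: 5 standard deviations or more maps to the cap.
    return min(cap, round(z * 20.0, 1))


def flag(asset: str, baseline: list[float], observed: float,
         threshold: float = 60.0) -> dict:
    """Return the asset's score and whether it crosses the assumed alert threshold."""
    score = risk_score(baseline, observed)
    return {"asset": asset, "risk_score": score, "anomalous": score >= threshold}
```

With a baseline of roughly 100 outbound connections per hour, an observation of 500 would be capped at the maximum score, while 102 would stay well under the threshold.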
Concerns
Organizations naturally want to know where the platform is installed, whether their data is being taken or moved to the cloud, and whether their security is being exposed.
The simple answer is that it is installed within the security perimeter of the customer’s enterprise: on their own premises or within their own cloud environment. Data does not leave the customer environment.
Real-Time Analytics
To summarize, the following functions are performed in real-time:
- Detection of known threats, utilizing a Rules Engine
- Data enrichment
- A query engine that lets the security analyst search the databases for whatever they are looking for
- Reports, some pre-built and some customized
- Threat hunting
- Asset mapping
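The first item in the list, rule-based detection of known threats, might look something like the sketch below. The `Rule` type, the two sample rules, and the event field names (`threat_intel`, `protocol`, `dst_port`) are hypothetical; they only show how built-in and analyst-written rules can share one evaluation path.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Rule:
    name: str
    matches: Callable[[dict], bool]  # predicate over one enriched event


# Illustrative rules only; real rule content would come from the SOC team.
RULES = [
    Rule("known-malicious-destination",
         lambda e: e.get("threat_intel", {}).get("malicious", False)),
    Rule("dns-on-nonstandard-port",
         lambda e: e.get("protocol") == "dns" and e.get("dst_port") != 53),
]


def evaluate(event: dict, rules: list[Rule] = RULES) -> list[str]:
    """Return the names of all rules the event triggers."""
    return [rule.name for rule in rules if rule.matches(event)]
```

An analyst-defined rule is just another `Rule` appended to the list, so it is evaluated on the same stream as the built-in ones.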
Real-time Asset Mapping
This is an essential process. It involves reading the source IP address in the data as it connects to a destination IP address. The source IP is enriched with Active Directory information and mapped to a particular machine, so that analysts looking at an alert see the machine name that held that IP at the time the event actually happened. Because IP assignments change dynamically (DHCP), real-time mapping is essential for linking a machine to an IP address correctly. Logins and logoffs on that IP/machine are also mapped, along with the user logged in at the time of the event. Real-time mapping of unknown threats is based on behavioral analytics and provides deep network visibility.
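The time-sensitive part of asset mapping can be sketched as follows: given a stream of lease and logon/logoff events, resolve which machine and user held an IP at the moment an event occurred. The `AssetMap` class, its method names, and the shape of the lease and session events are assumptions made for this example, not a description of how any particular product reads DHCP or Active Directory data.

```python
from collections import defaultdict


def _latest_before(entries, ts):
    """Value of the most recent (timestamp, value) entry at or before ts, else None."""
    best = None
    for entry_ts, value in entries:
        if entry_ts <= ts and (best is None or entry_ts >= best[0]):
            best = (entry_ts, value)
    return None if best is None else best[1]


class AssetMap:
    """Tracks which host held an IP, and which user was logged in, over time."""

    def __init__(self):
        self._leases = defaultdict(list)    # ip -> [(ts, hostname)]
        self._sessions = defaultdict(list)  # hostname -> [(ts, user or None)]

    def record_lease(self, ts, ip, hostname):
        self._leases[ip].append((ts, hostname))

    def record_logon(self, ts, hostname, user):
        self._sessions[hostname].append((ts, user))

    def record_logoff(self, ts, hostname):
        self._sessions[hostname].append((ts, None))  # None marks "nobody logged in"

    def resolve(self, ts, ip):
        """Return the host and user associated with ip at event time ts."""
        hostname = _latest_before(self._leases.get(ip, []), ts)
        user = _latest_before(self._sessions.get(hostname, []), ts) if hostname else None
        return {"ip": ip, "hostname": hostname, "user": user}
```

Resolving against the event's own timestamp is what keeps the mapping correct after a lease moves to another machine: the same IP yields different hosts at different times.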
Real-time vs. Batch Analytics
SIEM solutions are mainly used as tools that create alerts. The challenge is that their analytics run in batch mode. To investigate something, the SOC analyst has to run a query and pull data, and enrichment starts only when the query is sent. That takes time. Real-time enrichment while ingesting the data is a big differentiator: enriched data is always available for querying, which minimizes the time to get results.
Scalability, Versatility, Productivity
Any such solution needs to be scalable. Why is this important? After customers install, their requirements change over time: there are more data sources to ingest, and different divisions bring different business requirements. Software platforms can scale horizontally, regardless of the number and type of data sources, without breaking the architecture. Scale is attained by spinning up additional hardware to accommodate additional ingestion points and additional data to ingest and analyze. Deployment becomes versatile: on-premises, in the cloud, or in a hybrid environment.
It also improves the productivity of the security department within any organization. Level 3 engineers can focus on their most important tasks and build a playbook so that Level 1 and Level 2 analysts follow standard operating procedures (SOPs) of investigation before escalating to the next level.
SIEM Augmentation
Augmenting existing SIEM products is an advantage of this technology. Because enrichment happens in real time at ingestion, it upgrades these existing platforms with the ability to perform further analysis. One can query the attributes stored in the SIEM, including the endpoint data coming into it, so there is no need to ingest endpoint data again into the Advanced Security Analytics platform. To get endpoint data for a particular asset, the SOC analyst clicks a button; the platform retrieves the attributes from the SIEM database and tags them to the asset under investigation, giving a holistic view of that asset in a rich visualization layer. This results in quick identification of threats, and the platform should integrate easily with Phantom or other orchestration platforms. It provides visibility into all network events and leaves the data stored in the SIEM without duplicating it.
Ideally, it can be used as a threat hunting platform that augments what an organization already has in place. The SIEM database can be queried from within the organization’s own environment: one goes to the specific asset icon, clicks on it, and gets a context menu with an entry that says “Query SIEM for end-point data”. Through integration with the SIEM API, the platform pulls the information and enriches the asset information.
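The on-demand lookup behind a menu entry like this could be structured as in the sketch below. No real SIEM API is used here: `SiemQuery`, `query_siem_for_endpoint_data`, and `fake_siem_query` are hypothetical names, and the canned response stands in for whatever the SIEM integration would return.

```python
from typing import Callable

# Hypothetical callable: takes an asset hostname, returns endpoint attributes.
SiemQuery = Callable[[str], dict]


def query_siem_for_endpoint_data(asset: dict, siem_query: SiemQuery) -> dict:
    """Attach endpoint attributes fetched from the SIEM to an asset view.

    The attributes are fetched on request and returned with the view;
    nothing is copied into the platform's own storage.
    """
    endpoint = siem_query(asset["hostname"])
    return {**asset, "endpoint": endpoint}


def fake_siem_query(hostname: str) -> dict:
    """Stand-in for a real SIEM API call; returns canned data for illustration."""
    canned = {"LAPTOP-A": {"os": "Windows 10", "av_status": "up to date"}}
    return canned.get(hostname, {})
```

Passing the query function in as a parameter keeps the platform loosely coupled to the SIEM: swapping vendors means swapping that one callable.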
The value proposition for customers is that data need not be duplicated in a separate system for querying, which avoids additional storage and compute costs. A loosely coupled architecture allows one to take network traffic into the platform and query it together with other data, e.g. endpoint traffic, held in a separate database in a separate system, without any duplication. Hence this is an augmentation of their implementation, not a rip-and-replace.
In conclusion, these clear differentiators create a significant ROI for organizations by providing holistic network and endpoint security across the enterprise.