In the information security space, innovation is the name of the game. When your adversary ups her game, you must take steps to improve yours too. But according to security experts, today’s cyber threats have grown so sophisticated that security professionals must develop data science and data engineering skills if they’re going to protect our digital byways.
John Omernik didn’t start his career as a big data security expert. With a degree in computer science, he had several jobs providing general IT and computer support. Gradually he started developing credentials as a security engineer, and he plied his trade at several banks, including Bank of America, Zions Bancorporation, and Associated Bank.
At some point along the way, Omernik realized that the job description for security engineer had changed. It was becoming too hard to detect the bad guys using traditional security tools, and the bad guys knew it. In order to protect his clients from rapidly evolving threats, Omernik needed to up his game.
So he started working with data science tools and using data science techniques to try to stay one step ahead of the bad guys. He worked with Python data science notebooks, used Kafka to stream security data, built security data warehouses in Hive, used Spark to build security models, and used visual analytics to detect anomalies.
“Basically what I started on in 2010 was a journey where as a security practitioner, the more I learned about big data, data science, machine learning, artificial intelligence, I realized that for information security practitioner to stay relevant in this field, we need to start adopting those skill sets,” he said.
There are multiple parts to this evolution. First, the siloed nature of traditional security products, which store their own data and don’t easily integrate with others, is a recipe for missing critical clues left by cyber attackers.
“We need better data platforms,” said Omernik, who started working at MapR Technologies as a distinguished technologist earlier this year. “There’s value to be found by mashing up some of these different data sets within the organization.”
Security pros should stop thinking about “security data” as a separate entity from the organization’s regular data, Omernik says. “We are holistically looking at data, not just security or enterprise data – it’s just data,” he told Datanami. “I’m a huge proponent of making data access easier for the practitioner.”
Being able to mix different data sets is critical to detecting bad guys. For example, if a security professional wants to run some transactional banking data against logs generated by the intrusion detection system to find interesting correlations that could indicate a hack, the data platform should help, not hinder, that process.
“So if I have a security event and information management [SIEM] product and I want to load transactional data into that, there might be a lot of consternation from our enterprise data warehouse group about moving the data to a security platform that they don’t know how it’s being stored, managed, secured, encrypted,” Omernik said. “And then the SIEM product might have licensing rules or restrictions on how much data can be put in there. Then silos start forming.”
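As a rough illustration of the kind of correlation described above, the sketch below checks transaction records against intrusion detection alerts and flags any transaction that came from an alerted IP address shortly after the alert fired. The records, the field names (`source_ip`, `signature`, and so on), the `correlate` helper, and the 15-minute window are all assumptions made for the example; they are not drawn from Omernik's work or from any particular SIEM or banking product.

```python
from datetime import datetime, timedelta

# Hypothetical sample records; field names and values are illustrative only.
transactions = [
    {"account": "A-100", "source_ip": "203.0.113.7", "amount": 9500.0,
     "time": datetime(2018, 5, 1, 10, 5)},
    {"account": "B-200", "source_ip": "198.51.100.4", "amount": 42.0,
     "time": datetime(2018, 5, 1, 10, 7)},
    {"account": "C-300", "source_ip": "203.0.113.7", "amount": 120.0,
     "time": datetime(2018, 5, 1, 14, 0)},
]
ids_alerts = [
    {"source_ip": "203.0.113.7", "signature": "brute-force login",
     "time": datetime(2018, 5, 1, 10, 0)},
]


def correlate(transactions, alerts, window=timedelta(minutes=15)):
    """Pair each transaction with any alert from the same IP that fired
    no more than `window` before the transaction."""
    # Index alerts by IP so each transaction needs only one dictionary lookup.
    alerts_by_ip = {}
    for alert in alerts:
        alerts_by_ip.setdefault(alert["source_ip"], []).append(alert)

    matches = []
    for txn in transactions:
        for alert in alerts_by_ip.get(txn["source_ip"], []):
            delay = txn["time"] - alert["time"]
            if timedelta(0) <= delay <= window:
                matches.append((txn, alert))
    return matches


suspicious = correlate(transactions, ids_alerts)
for txn, alert in suspicious:
    print(txn["account"], txn["amount"], alert["signature"])
```

With this sample data, only the first transaction is flagged: the third shares the alerted IP address but arrives hours after the alert. On a shared data platform the same logic would more likely be a join between two tables, but the idea is the same.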
The modern security professional should have a fair share of data science skills, too. That means being able to pick out the relevant features from a sample data set, build a machine learning model that can automatically detect those features in the real world, and deploy that model into production using the latest technologies, like Spark, TensorFlow, Docker, and Kubernetes.
“Information security practitioners need to learn data science skills,” Omernik stated unequivocally. “They need to huddle themselves and say, I may be a fantastic security practitioner, but I’m going to have to slow down and learn this new thing that I never thought I’d have to learn when I was starting my career.”
Considering the size of the data sets, the advanced curation and governance needs, and the rapid pace of innovation in the security world, the best security pros will know their share of data engineering too.
“So you get that list of CSV or JSON files. The information security practitioner needs to be able to use that data right away, not two weeks after it’s gone through an ETL process because that threat exists today, not two weeks from now when an ETL is complete,” he said.
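To make the point about working with raw files concrete, here is a minimal sketch that reads a CSV list of threat indicators and a set of JSON log events as they arrive and matches them immediately, with no intermediate load step. The file contents, the column and field names (`indicator`, `dest`, `host`), and the helper functions are hypothetical stand-ins; in practice the inputs would be real files or a query engine pointed at them.

```python
import csv
import io
import json

# In-memory stand-ins for files that arrived today; the contents are made up.
indicators_csv = io.StringIO(
    "indicator,type\n"
    "203.0.113.7,ip\n"
    "evil.example,domain\n"
)
events_jsonl = io.StringIO(
    '{"host": "web01", "dest": "evil.example"}\n'
    '{"host": "web02", "dest": "intranet.example"}\n'
    '{"host": "db01", "dest": "203.0.113.7"}\n'
)


def load_indicators(handle):
    """Read the CSV as-is and return the set of indicator values."""
    return {row["indicator"] for row in csv.DictReader(handle)}


def match_events(handle, indicators):
    """Yield each JSON event (one per line) whose destination is a known indicator."""
    for line in handle:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        event = json.loads(line)
        if event.get("dest") in indicators:
            yield event


indicators = load_indicators(indicators_csv)
hits = list(match_events(events_jsonl, indicators))
for event in hits:
    print(event["host"], "contacted", event["dest"])
```

Swapping the `io.StringIO` objects for `open(...)` calls on real files would leave the rest of the code unchanged.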
At the end of the day, the best information security professionals are jacks of all trades. They may not be masters of every data-oriented skill, but they know enough to get around. Omernik recommends that security pros keep this in mind when choosing their career path.
“They might have skills a data scientist, a data engineer, and a data operator might have, so they can talk successfully with the data operations team or data science team,” he said. “They can do some of the work themselves, they can clearly articulate what they need out of the data platform, and they can judge the efficacy of what the data platform is doing for their organization.”
There are a lot of different ways that threats can attack an organization, particularly in the financial industry. By learning the skills of the data scientist – and the data engineer, too – a security pro can more effectively repel that attack.