The talk below has been accepted for DEFCON 2015 in Las Vegas this August. For those in the UMBC community who are interested, John will give an informal preview of it on Friday 7/17 at 3:00PM in ITE 366 (DREAM Lab).
"Quantum" Classification of Malware
John Seymour, UMBC
3:00pm Friday 17 July 2015, ITE 366
Quantum computation has recently become an important area for security research, with its applications to factoring large numbers and secure communication. In practice, only one company (D-Wave) has claimed to have built a quantum computer that can solve relatively hard problems, and that claim has been met with much skepticism. Regardless of whether it uses quantum effects for computation, the D-Wave architecture cannot run the standard quantum algorithms, such as Grover’s and Shor’s. It is instead purported to be useful for machine learning and for heuristically solving NP-complete problems.
We show why the D-Wave architecture and the machine learning problem of malware classification seem especially suited to each other. We also explain how to translate the classification problem for malicious executables into an optimization problem that a D-Wave machine can solve. Specifically, using a 512-qubit D-Wave Two processor, we show that a minimalist malware classifier can be built with cross-validation accuracy comparable to that of standard machine learning algorithms. However, even such a minimalist classifier incurs a surprising level of overhead.
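To give a feel for what "translating classification into an optimization problem" can look like, here is a minimal sketch. It is not the formulation from the talk, which the abstract does not spell out. It assumes a QBoost-style setup: a handful of weak classifiers each vote +1 (malicious) or -1 (benign) on every sample, and we pick a binary on/off weight per classifier by minimizing a quadratic function of those bits (a QUBO, the kind of problem an annealer accepts). The names `build_qubo`, `qubo_energy`, and `brute_force_minimum`, the toy votes in `H`, and the penalty `lam` are all hypothetical, and exhaustive search stands in for the hardware.

```python
from itertools import product


def build_qubo(H, y, lam=0.1):
    """Build QUBO coefficients for selecting weak classifiers.

    H[s][i] is the vote (+1 or -1) of weak classifier i on sample s.
    y[s] is the true label (+1 or -1) of sample s.
    Minimizes sum_s ((1/n) * sum_i w_i * H[s][i] - y[s])**2 + lam * sum_i w_i
    over binary w, with the constant term sum_s y[s]**2 dropped.
    """
    n = len(H[0])
    Q = {}
    for i in range(n):
        for j in range(i, n):
            corr = sum(row[i] * row[j] for row in H) / n**2
            if i == j:
                # Linear term: w_i**2 == w_i for binary w_i.
                fit = sum(row[i] * label for row, label in zip(H, y))
                Q[(i, i)] = corr - 2.0 * fit / n + lam
            else:
                # Each unordered pair (i, j) appears twice in the square.
                Q[(i, j)] = 2.0 * corr
    return Q


def qubo_energy(Q, w):
    """Evaluate the QUBO objective for a tuple of bits w."""
    return sum(coeff * w[i] * w[j] for (i, j), coeff in Q.items())


def brute_force_minimum(Q, n):
    """Stand-in for the annealer: try all 2**n bit assignments."""
    return min(product((0, 1), repeat=n), key=lambda w: qubo_energy(Q, w))


# Toy data (assumed): 4 samples, 3 weak classifiers.
# Classifier 0 always agrees with the label, classifier 1 is right half
# the time, and classifier 2 is always wrong.
H = [
    [+1, +1, -1],
    [+1, -1, -1],
    [-1, -1, +1],
    [-1, +1, +1],
]
y = [+1, +1, -1, -1]

Q = build_qubo(H, y)
best = brute_force_minimum(Q, n=3)
print(best)  # (1, 0, 0): only the accurate classifier is kept
```

On this toy input the minimum keeps only the classifier that agrees with the labels. On real hardware the coefficients in `Q` would be handed to the annealer rather than searched exhaustively, and fitting them onto the chip's limited qubit connectivity is one plausible source of the kind of overhead the abstract mentions.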
John Seymour is a Ph.D. student at the University of Maryland, Baltimore County, where he performs research at the intersection of machine learning and information security. He's mostly interested in avoiding, and helping others avoid, some of the major pitfalls in machine learning, especially in dataset preparation (seriously, do people still use malware datasets from 1998?). In 2014, he completed his Master’s thesis on the subject of quantum computation applied to malware analysis. He currently works at CyberPoint International, a company in Baltimore, MD that performs network and host-based machine learning.