Software Similarity and Classification
Software similarity and classification is an emerging topic with wide applications, including malware detection, software theft detection, plagiarism detection, and software clone detection. The key methods for identifying software variants, clones, derivatives, and classes of software are extracting program features, processing those features into suitable representations, and constructing distance metrics to define similarity and dissimilarity. Software Similarity and Classification reviews the literature on these core concepts and on each application area, and demonstrates that treating these applied problems as similarity and classification problems enables techniques to be shared between areas. The authors also present in-depth case studies using the software similarity and classification techniques developed throughout the book.
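As a rough illustration of the pipeline described above (extract features, build a representation, apply a distance metric), the following minimal sketch compares opcode sequences by their n-gram sets using the Jaccard index. It is not taken from the book: the function names `ngrams` and `jaccard`, the choice of n = 2, and the toy opcode sequences are all assumptions made for the example.

```python
def ngrams(seq, n=2):
    """Return the set of contiguous n-grams (as tuples) in a sequence."""
    return {tuple(seq[i:i + n]) for i in range(len(seq) - n + 1)}


def jaccard(a, b):
    """Jaccard index of two sets: |a & b| / |a | b|.

    Two empty sets are treated as identical (similarity 1.0);
    this convention is an assumption of the sketch.
    """
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


# Hypothetical opcode sequences standing in for extracted program features.
original = ["push", "mov", "add", "cmp", "jne", "ret"]
variant = ["push", "mov", "sub", "cmp", "jne", "ret"]    # one opcode changed
unrelated = ["xor", "call", "pop", "jmp", "nop", "ret"]

sim_variant = jaccard(ngrams(original), ngrams(variant))
sim_unrelated = jaccard(ngrams(original), ngrams(unrelated))

print(f"original vs variant:   {sim_variant:.3f}")
print(f"original vs unrelated: {sim_unrelated:.3f}")
```

In this toy case the variant shares three of seven distinct bigrams with the original, while the unrelated sequence shares none, so a simple threshold on the score would separate them. Real systems would typically work with richer features, such as control flow graphs or API call traces, and with more robust metrics.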
2 Taxonomy of Program Features
3 Program Transformations and Obfuscations
4 Formal Methods of Program Analysis
5 Static Analysis of Binaries
6 Dynamic Analysis
7 Feature Extraction
8 Software Birthmark Similarity
9 Software Similarity Searching and Classification
11 Future Trends and Conclusion