Data Science for Security and Forensics
Study plans 2016-2017
- 7.5 ECTS
On the basis of
BSc-level basics in statistics and mathematics, i.e. prior knowledge of basic statistical methods such as descriptive statistics, probability, sampling distributions, and hypothesis testing, as well as basic analysis and matrix algebra.
Expected learning outcomes
- Understand the principles by which multidimensional statistical methods differ from one-dimensional methods.
- Understand the distribution of information in statistical analysis and its meaning in data representation.
- Extract features from raw, measured values of data to be analyzed.
- Program some basic classification and clustering methods and test their validity.
- Program some basic neural network methods and test their validity.
- Apply basic statistical and data analysis methods to data relevant to information security, forensics and/or color/media technology.
- The students can use relevant scientific methods in independent research and development in machine learning and pattern recognition.
- The students are capable of carrying out an independent limited research or development project in machine learning and pattern recognition under supervision, following the applicable ethical rules.
- The students can work independently and are familiar with terminology of machine learning and pattern recognition as well as their application in the security and forensics domain.
- Learning, Intelligence, and Machine learning basics: principles, measures, performance evaluation, method combinations.
- Knowledge representations: discriminant and regression functions, probability distributions, Bayesian classifier.
- Learning as search: Exhaustive search, heuristic search, genetic algorithms.
- Attribute quality measures: measures for classification, measures for regression, application of feature-selection measures.
- Data preprocessing: Principal Component Analysis (PCA), Linear Discriminant Analysis (LDA).
- Supervised symbolic and statistical learning, basics of artificial neural networks.
- Unsupervised learning and cluster analysis: hierarchical and partitional clustering.
- Data classification: Bayesian classifier, k-NN classifier, multi-layered perceptron (MBPN), support vector machine (SVM), and Random Forest.
- Data clustering: k-means clustering, Self-Organizing Map (SOM).
- Classification and clustering validity testing: leave-one-out, ground truth.
- Practical tasks may include:
- Implement some search methods
- Implement some classification methods
- Implement some clustering methods
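As an illustration of what such a practical task might look like, the following is a minimal sketch of a k-NN classifier validated with leave-one-out testing against ground-truth labels. It is not taken from the course material: the function names, the toy data, and the "benign"/"malicious" labels are hypothetical and chosen only for this example, and it assumes Python 3.8 or later (for math.dist).

```python
from collections import Counter
import math


def knn_predict(train_x, train_y, query, k=3):
    """Predict the label of `query` by majority vote among its k nearest neighbours."""
    # Rank training samples by Euclidean distance to the query point.
    order = sorted(range(len(train_x)), key=lambda i: math.dist(train_x[i], query))
    votes = Counter(train_y[i] for i in order[:k])
    return votes.most_common(1)[0][0]


def leave_one_out_accuracy(xs, ys, k=3):
    """Hold out each sample once, classify it from the rest, and return the hit rate."""
    hits = 0
    for i in range(len(xs)):
        rest_x = xs[:i] + xs[i + 1:]
        rest_y = ys[:i] + ys[i + 1:]
        if knn_predict(rest_x, rest_y, xs[i], k) == ys[i]:
            hits += 1
    return hits / len(xs)


# Hypothetical toy data: two well-separated 2-D groups with known (ground-truth) labels.
samples = [(0.0, 0.1), (0.2, 0.0), (0.1, 0.3), (5.0, 5.1), (5.2, 4.9), (4.8, 5.0)]
labels = ["benign", "benign", "benign", "malicious", "malicious", "malicious"]

accuracy = leave_one_out_accuracy(samples, labels, k=3)
print(f"leave-one-out accuracy: {accuracy:.2f}")
```

Leave-one-out is convenient for small data sets like this one because every sample is used for testing exactly once; for larger data sets it becomes expensive, since the classifier is evaluated once per sample.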
Net Support Learning
Teaching Methods (additional text)
Four major assignments covering theoretical and practical aspects of the topics (graded)
Form(s) of Assessment
Written exam, 3 hours
Form(s) of Assessment (additional text)
- Written exam (60%)
- 4 major assignments (40% total, 10% each)
- The written exam and all major assignments must be passed
Alphabetical scale, A (best) – F (fail)
Internal examiner on the assignments, both internal and external examiner on the written exam.
For the written exam: Ordinary re-sit examination in August. The major assignments, if passed, need not be re-submitted.
Books/standards, conference/journal papers and web resources, such as:
- I. Kononenko, M. Kukar, Machine Learning and Data Mining: Introduction to Principles and Algorithms, Horwood Publishing, Chichester, U.K., 2007, ISBN 1-904275-21-4
Recommended further reading:
- T. Mitchell, Machine Learning, McGraw Hill, 1997.
- R. O. Duda, P. E. Hart, and D. G. Stork, Pattern Classification, 2nd edition, Wiley, 2001.
- S. Theodoridis and K. Koutroumbas, Pattern Recognition, 3rd edition, Academic Press.
Replacement course for
IMT4612 Machine learning and pattern recognition
The course will be made accessible to both campus and remote students. Every student is free to choose the pedagogic arrangement that best fits her/his own requirements. The lectures in the course will be given on campus and are open to both categories of students. All the lectures will also be available on the Internet through the learning management system.