Multimedia Computing and Computer Vision Lab
















WS 16/17: Multimedia Projekt

From Multimedia Computing Lab - University of Augsburg

Registration is now open at

[Image: Dalal_dense_detect.PNG] Model: [Image: Project_detect.PNG]

Left two images taken from [1].


Instructors: Anton Winschel, Prof. Dr. Rainer Lienhart
Lecture time: Tuesday, 10:00-13:45, Room 1020N. Starts October 18.
Exercise time: Tuesday, 10:00-13:45, Room 1020N.
Credits: 6 SWS (weekly semester hours), 10 LP (credit points)
Language: German


  • [23.09.2016] Page created


The topic of this course is the detection of humans in images.

Object detection is one of the most challenging tasks in computer vision and machine learning. The difficulty arises because many objects vary widely in appearance; humans, for instance, adopt many different poses and appear at different sizes.

The goal of this project is to detect object instances in images using local features and supervised learning methods. Students will implement a human detector that localizes each instance by placing a tight bounding box around it.
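To make the notion of a "tight" bounding box concrete, the following minimal sketch measures how well a predicted box overlaps an annotated one using intersection over union (IoU). The `Box` type, the function names, and the default threshold of 0.5 are assumptions chosen for illustration; they are not taken from the course material.

```cpp
#include <algorithm>
#include <cassert>

// Axis-aligned bounding box in pixel coordinates (hypothetical type for illustration).
struct Box {
    double x;       // left edge
    double y;       // top edge
    double width;
    double height;
};

// Area of a box; boxes with negative extent count as empty.
double area(const Box& b) {
    return std::max(0.0, b.width) * std::max(0.0, b.height);
}

// Intersection over union of two boxes, a value in [0, 1].
double intersectionOverUnion(const Box& a, const Box& b) {
    const double left   = std::max(a.x, b.x);
    const double top    = std::max(a.y, b.y);
    const double right  = std::min(a.x + a.width,  b.x + b.width);
    const double bottom = std::min(a.y + a.height, b.y + b.height);

    const double inter = std::max(0.0, right - left) * std::max(0.0, bottom - top);
    const double uni   = area(a) + area(b) - inter;
    return uni > 0.0 ? inter / uni : 0.0;
}

// A detection counts as correct here if it overlaps the annotation enough.
// The default threshold of 0.5 is an assumption, not a course requirement.
bool isMatch(const Box& detection, const Box& groundTruth, double threshold = 0.5) {
    assert(threshold >= 0.0 && threshold <= 1.0);
    return intersectionOverUnion(detection, groundTruth) >= threshold;
}
```

A box that is much larger than the person it contains has a large union but a small intersection with the annotation, so its IoU drops. This is why an overlap measure of this kind rewards tight boxes.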

The project includes the following main tasks:

  • Extract HOG (Histogram of Oriented Gradients) [2] image features.
  • Learn the basics of Support Vector Machines and how to use them for classification and regression.
  • Evaluate detection results on images from a pedestrian dataset.
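The first two tasks can be sketched in plain C++ as follows. The example computes a magnitude-weighted orientation histogram for a single HOG cell and evaluates the decision function of a linear classifier, f(x) = w · x + b, on a feature vector. It is a simplified illustration under stated assumptions: each pixel votes into exactly one bin, and the vote interpolation and block normalization described in [2] are omitted. The `GrayImage` type and all function names are hypothetical, and the weights would come from SVM training, which is not shown.

```cpp
#include <array>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <numeric>
#include <stdexcept>
#include <vector>

// Row-major grayscale image (hypothetical helper type for illustration).
struct GrayImage {
    int width;
    int height;
    std::vector<float> pixels;  // size == width * height

    // Pixel access with coordinates clamped to the image border.
    float at(int x, int y) const {
        x = x < 0 ? 0 : (x >= width ? width - 1 : x);
        y = y < 0 ? 0 : (y >= height ? height - 1 : y);
        return pixels[static_cast<std::size_t>(y) * width + x];
    }
};

constexpr int kBins = 9;  // unsigned orientation bins over 0..180 degrees, as in [2]

// Magnitude-weighted orientation histogram of one square cell.
// Simplification: hard binning, no interpolation between neighboring bins.
std::array<float, kBins> cellHistogram(const GrayImage& img, int x0, int y0, int cellSize) {
    assert(cellSize > 0);
    std::array<float, kBins> hist{};
    const float pi = 3.14159265358979f;
    for (int y = y0; y < y0 + cellSize; ++y) {
        for (int x = x0; x < x0 + cellSize; ++x) {
            // Centered [-1, 0, 1] derivative masks in x and y.
            const float gx = img.at(x + 1, y) - img.at(x - 1, y);
            const float gy = img.at(x, y + 1) - img.at(x, y - 1);
            const float magnitude = std::sqrt(gx * gx + gy * gy);

            float angle = std::atan2(gy, gx) * 180.0f / pi;  // -180..180 degrees
            if (angle < 0.0f) angle += 180.0f;               // fold to unsigned orientation
            if (angle >= 180.0f) angle -= 180.0f;

            int bin = static_cast<int>(angle / (180.0f / kBins));
            if (bin >= kBins) bin = kBins - 1;               // guard against rounding
            hist[bin] += magnitude;
        }
    }
    return hist;
}

// Decision value of a linear classifier: f(x) = w . x + b.
// By the convention assumed here, a positive value means "human".
float linearScore(const std::vector<float>& weights, float bias,
                  const std::vector<float>& feature) {
    if (weights.size() != feature.size()) {
        throw std::invalid_argument("weights and feature must have equal length");
    }
    return std::inner_product(weights.begin(), weights.end(), feature.begin(), bias);
}
```

In a full detector, the histograms of all cells in a detection window would be normalized and concatenated into one feature vector before being passed to the classifier.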

This course is divided into two phases:

  • Weekly assignments will introduce students to programming with OpenCV step by step and provide initial hands-on experience in image processing and machine learning.
  • Student groups will work on a project in weekly sessions.


  • Participation in the exercises to prepare for the project
  • Each student must read the related papers thoroughly
  • Teamwork
  • Written documentation of the project (can be done in Word, LaTeX, OpenOffice, etc.)
  • Short presentation of the results
  • Programming will mostly be in C/C++
  • Code may be optimized using Intel's IPP and TBB libraries


The course is held in German. You may write the documentation in either German or English, whichever you prefer.


  1. Navneet Dalal. Finding People in Images and Videos. PhD thesis, Institut National Polytechnique de Grenoble / INRIA Grenoble, Grenoble, July 2006.
  2. Navneet Dalal and Bill Triggs. Histograms of Oriented Gradients for Human Detection. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, San Diego, USA, pages 886-893, June 2005.