Research

This page is dedicated to my research and that of my group. My academic profile, including a CV and a list of grants (projects), can also be found via my ZHAW profile page. A preserved version of my research website at the University of Marburg can be found at my Philipps University profile page.

TOC

  1. The group
  2. Recent work
  3. Collaborations
  4. Publications

The group

The AI/ML research team at the ZHAW Information Engineering group, 2018: Frank-Peter Schilling, Mohammadreza Amirian, Thilo Stadelmann, Ismail Elezi, Lukas Tuggener, Katharina Rombach, and Daniel Neururer (not pictured: Stefan Huschauer).

My AI/ML team performs pattern recognition research, working on a wide variety of tasks on image, audio or generally “signal” data. As a group, we focus on deep learning and reinforcement learning methodology, inspired by biological learning. Each task that we study has its own learning target (e.g., detection, classification, clustering, segmentation, novelty detection, control) and use case (e.g., predictive maintenance, speaker diarization for multimedia indexing, document analysis, optical music recognition, computer vision for industrial quality control, automated machine learning, deep reinforcement learning for automated game play or building control), which in turn sheds light on different aspects of the learning process.

The group has very diverse backgrounds, which helps us complement each other’s skills:

  • Prof. Dr. Thilo Stadelmann, team lead: Diplom (THM) & PhD (U. Marburg) in computer science; I come from the tradition of the field of artificial intelligence, and my tool is machine learning. My research interest is to discover and understand how smart behaviour can be evoked through self-learning, including practical (robustness, generalization) and ethical (algorithmic bias) side effects.

  • Lukas Tuggener, PhD student: B.Sc. (ZHAW) in industrial engineering, M.Sc. (ETH) in statistics, doctorate in conjunction with Juergen Schmidhuber of the University of Lugano / Switzerland

  • Mohammadreza Amirian, PhD student: B.Sc. (U. Shiraz) & M.Sc. (U. Ulm) in electrical engineering, doctorate in conjunction with Friedhelm Schwenker of Ulm University / Germany

  • Stefan Huschauer, M.Sc. student: B.Sc. computer science (ETH & ZHAW), focussing on computer vision

  • Daniel Neururer, M.Sc. student: B.Sc. in systems engineering (ZHAW)

  • Dr. Frank-Peter Schilling, senior researcher: Diplom & PhD (U. Heidelberg) in physics, PostDoc at CERN

  • Claude Lehmann, M.Sc. student: B.Sc. computer science (ZHAW), focusing on deep learning

  • Dr. Ricardo Chavarriaga, senior researcher & head of CLAIRE office Zurich: B.Sc. electronics engineering (Pontificia Universidad Javeriana, Colombia), graduate school in computer/communication/information science (EPFL), PhD neuroscience (EPFL), Postdoc neuroscience/brain-computer interfaces (EPFL & IDIAP)

  • Dr. Javier Montoya, senior researcher: B.Sc. computer engineering (UCSP, Arequipa, Peru), M.Sc. computer science (Unicamp, Sao Paulo, Brazil), M.Sc. computer science (ENSIMAG, Grenoble, France), PhD computer vision and machine learning (ETHZ), Postdoc biomedical engineering (ETHZ), focusing on deep learning and medical imaging

Our alumni are:

  • Ismail Elezi, B.Sc. (U. Pristina), M.Sc. (Ca’Foscari) & PhD (Ca’Foscari & ZHAW, July 2020) in computer science, now PostDoc with Prof. Laura Leal-Taixé of TU Munich

  • Yvan Satyawan, B.Sc. computer science (U. Freiburg, Germany), now pursuing an M.Sc. degree in Denmark

  • Katharina Rombach, B.Sc. & M.Sc. (admitted to EDIC and ERDS doctoral schools at EPFL), now PhD student with Prof. Olga Fink of ETH Zurich

  • Reza Kakooee, M.Sc., now PhD student with Prof. Marc Pouly of Lucerne University of Applied Sciences

  • Benjamin Bruno Meier, M.Sc. (Hirschmann scholarship holder & Dr. Waldemar Jucker laureate), now Software Engineer at Argus Data Insights Schweiz AG

  • Gabriel Eyyi, M.Sc., now Software Engineer / Machine Learning Engineer at dizmo AG

  • Thierry Musy, B.Sc., now Partner / Senior Data Scientist at Foursight Digital AG

  • Jan Stampfli, B.Sc., now Big Data Engineer at Migros-Genossenschafts-Bund

 

 

Recent work

  1. Robust and practical deep learning
  2. Learning to learn
  3. Optical music recognition (OMR)
  4. Voice recognition
  5. Data science

Robust and practical deep learning

Deep learning has reached the point of practical applicability in solving day-to-day tasks in many non-AI businesses, for instance manufacturing SMEs. Specific challenges arise and are tackled in our applied research projects, ranging from data quality and quantity issues to higher requirements on robustness and resilience of the models. For instance, segmenting newspaper pages into articles that semantically belong together is a necessary prerequisite for article-based information retrieval on print media collections such as archives and libraries. It is challenging due to vastly differing layouts of papers, various content types and different languages, but commercially very relevant, for example for media monitoring.

Examples of automatic segmentation

We have developed a semantic segmentation approach based on the visual appearance of each page. We apply a fully convolutional neural network (FCN) that we train in an end-to-end fashion to transform the input image into a segmentation mask in one pass. We show experimentally that the FCN performs very well: it outperforms a deep learning-based commercial solution by a large margin in terms of segmentation quality while also being two orders of magnitude more efficient computationally. The whole system is trained with only 5,500 images, of which fewer than 500 are fully labeled.

Newspaper article segmentation architecture

Additionally, the existence of adversarial attacks on convolutional neural networks (CNN) questions the fitness of such models for serious applications. Such attacks manipulate an input image such that misclassification is evoked while still looking normal to a human observer - they are thus not easily detectable. In a different context, backpropagated activations of CNN hidden layers - “feature responses” to a given input - have been helpful to visualize for a human “debugger” what the CNN “looks at” while computing its output. We have proposed a novel detection method for adversarial examples to prevent attacks. We do so by tracking adversarial perturbations in feature responses, allowing for automatic detection using average local spatial entropy. The method does not alter the original network architecture and is fully human-interpretable. Experiments confirm the validity of our approach for state-of-the-art attacks on large-scale models trained on ImageNet.
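The entropy-based score can be illustrated with a small Python sketch. It is a simplified illustration, not the published implementation: the function names, the 3x3 window, the 8-bin quantization and the assumption that a feature response is a 2D map with values in [0, 1] are all choices made for this example. How the resulting score is thresholded to flag an input would have to be determined from data and is not shown.

```python
import math
from collections import Counter


def local_entropy(patch_values, bins=8):
    """Shannon entropy (in bits) of the quantized values in one patch.

    Assumes values lie in [0, 1]; the number of bins is a hypothetical choice.
    """
    counts = Counter(min(int(v * bins), bins - 1) for v in patch_values)
    total = len(patch_values)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def average_local_spatial_entropy(response, window=3, bins=8):
    """Mean entropy over all window x window patches of a 2D map.

    `response` is a list of equally long rows (a 2D feature response map).
    """
    rows, cols = len(response), len(response[0])
    entropies = []
    for r in range(rows - window + 1):
        for c in range(cols - window + 1):
            patch = [
                response[r + i][c + j]
                for i in range(window)
                for j in range(window)
            ]
            entropies.append(local_entropy(patch, bins))
    return sum(entropies) / len(entropies)


# Usage: a flat map carries no local variation, a checkerboard carries a lot.
flat_map = [[0.5] * 6 for _ in range(6)]
checkerboard = [[float((r + c) % 2) for c in range(6)] for r in range(6)]
flat_score = average_local_spatial_entropy(flat_map)
checker_score = average_local_spatial_entropy(checkerboard)
```

A single scalar per feature response map keeps the score easy to inspect by a human, which is the property the paragraph above emphasizes.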

Detecting adversarial examples using local spatial entropy on feature response maps

Selected references (see also below)

 

Learning to learn

Example clusterings

We have built a novel end-to-end neural network architecture that, once trained, directly outputs a probabilistic clustering of a batch of input examples in one pass. It estimates a distribution over the number of clusters and, for each number of clusters up to a maximum, distributions over the respective data partitioning. The neural network is trained in a supervised fashion to group data by any perceptual similarity criterion based on pairwise labels (same/different group). It does not expect to have seen any of the groups that appear during model application already during training. We demonstrate promising performance on high-dimensional data like images (COIL-100) and speech (TIMIT). We call this learning to cluster. We have also produced a survey and some novel results on the more general topic of learning to learn.
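The two ends of this pipeline can be sketched in a few lines of Python: deriving same/different pairwise labels from known group memberships, and reading a hard clustering off the two kinds of output distributions. This is only an illustration under assumptions; the function names and the nested-list layout of the network outputs are hypothetical and not taken from our implementation.

```python
from itertools import combinations


def pairwise_labels(group_ids):
    """Map each index pair (i, j), i < j, to 1 (same group) or 0 (different)."""
    return {
        (i, j): int(group_ids[i] == group_ids[j])
        for i, j in combinations(range(len(group_ids)), 2)
    }


def decode_clustering(k_probs, assignments_per_k):
    """Pick the most probable cluster count and the matching hard assignment.

    Assumed (hypothetical) output layout:
      k_probs[k - 1]                 = P(number of clusters = k)
      assignments_per_k[k - 1][i][c] = P(example i belongs to cluster c | k)
    """
    best_k = max(range(len(k_probs)), key=lambda idx: k_probs[idx]) + 1
    soft = assignments_per_k[best_k - 1]
    hard = [max(range(best_k), key=lambda c: probs[c]) for probs in soft]
    return best_k, hard


# Usage with three examples and a maximum of two clusters.
targets = pairwise_labels(["a", "b", "a"])
k_probs = [0.1, 0.9]
assignments_per_k = [
    [[1.0], [1.0], [1.0]],                  # distributions for k = 1
    [[0.8, 0.2], [0.3, 0.7], [0.9, 0.1]],   # distributions for k = 2
]
best_k, hard_assignment = decode_clustering(k_probs, assignments_per_k)
```

Because supervision consists only of pairwise labels, the groups seen at training time never need to reappear at application time.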

Learning to cluster model architecture

Selected references (see also below)

 

Optical music recognition (OMR)

Detection & recognition confidences overlaid on a piece of handwritten music

Written music is a large and important part of cultural heritage worldwide. While there are many archives containing thousands of music scores, they are paper-based, so public access is cumbersome or even impossible. Digitization of these scores is currently impossible due to the non-availability of scanning software that can convert hand-written scores to a machine-readable format (Optical Music Recognition – OMR). The DeepScore project aims at bringing bleeding-edge technology from computer vision to the field of OMR. The impact of OMR on how we curate, preserve and access music manuscripts cannot be overstated. Fully functional OMR would lead to a democratization of the musical cultural heritage by enabling cheap and efficient access by everyone. It would also enable more efficient music training, and enable orchestras to run cheaper and rehearse more efficiently.

To facilitate deep learning for OMR, we built the DeepScores dataset with the goal of advancing the state-of-the-art in small object recognition by placing the question of object recognition in the context of scene understanding. DeepScores contains high-quality images of musical scores, partitioned into 300,000 sheets of written music that contain symbols of different shapes and sizes. With close to a hundred million small objects, this makes our dataset not only unique, but also the largest public dataset. DeepScores comes with ground truth for object classification, detection and semantic segmentation. We provide baseline performances for object classification and intuition for the inherent difficulty that DeepScores poses to state-of-the-art object detectors like YOLO or R-CNN.

We introduced a novel object detection method, based on synthetic energy maps and the watershed transform, called Deep Watershed Detector (DWD). Our method is specifically tailored to deal with high-resolution images that contain a large number of very small objects and is therefore able to process full pages of written music. We present state-of-the-art detection results for common music symbols and show DWD’s ability to work equally well on synthetic scores and handwritten music.
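The core idea of turning an energy map into object locations can be sketched as follows. This is a deliberately reduced illustration, not the DWD implementation: it assumes an energy map is already given as a 2D list in which object centers have high values, cuts it at a single hypothetical threshold, and reports the centroid of each connected region. Predicting the energy map with a network, as well as class and bounding-box prediction, are left out.

```python
from collections import deque


def detect_centers(energy, threshold):
    """Return the centroid (row, col) of every connected region >= threshold.

    `energy` is a 2D list; regions are 4-connected. The threshold is a
    hypothetical parameter of this sketch.
    """
    rows, cols = len(energy), len(energy[0])
    seen = [[False] * cols for _ in range(rows)]
    centers = []
    for r in range(rows):
        for c in range(cols):
            if seen[r][c] or energy[r][c] < threshold:
                continue
            # Breadth-first flood fill collects one connected region.
            queue = deque([(r, c)])
            seen[r][c] = True
            pixels = []
            while queue:
                y, x = queue.popleft()
                pixels.append((y, x))
                for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                    ny, nx = y + dy, x + dx
                    if (
                        0 <= ny < rows
                        and 0 <= nx < cols
                        and not seen[ny][nx]
                        and energy[ny][nx] >= threshold
                    ):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            center_y = sum(p[0] for p in pixels) / len(pixels)
            center_x = sum(p[1] for p in pixels) / len(pixels)
            centers.append((center_y, center_x))
    return centers


# Usage: two separate blobs in a tiny energy map yield two detections.
energy_map = [
    [0, 0, 0, 0, 0],
    [0, 3, 0, 0, 0],
    [0, 0, 0, 0, 0],
    [0, 0, 0, 2, 2],
    [0, 0, 0, 2, 2],
]
centers = detect_centers(energy_map, threshold=2)
```

Since each region is handled independently, the number of objects on a page does not need to be known or bounded in advance, which matters for pages full of tiny symbols.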

Deep Watershed Detector architecture

Selected references (see also below)

 

Voice recognition

My PhD research focused on the task of speaker clustering: grouping speech segments by speaker identity without prior knowledge of the number or identity of speakers (a prerequisite for e.g. content-based media indexing). While speaker identification usually achieved accuracies in the high nineties (percent), the state of the art for the more complex task of clustering performed an order of magnitude worse.

A clustering of 40 speakers from the TIMIT database

My 2009 ACM Multimedia paper on Unfolding Speaker Clustering Potential – a Biomimetic Approach (see also the code) not only analyzed this fact, but also identified deficiencies in modeling the sequence of speech features as the bottleneck responsible for the slump in performance. The prediction of potentially raising speaker clustering performance by an order of magnitude by better sequence modeling has led to exciting discoveries so far. We successively built deep learning models with more clustering capability to exploit the sequence information: a simple CNN, a CNN with optimized clustering loss and finally an RNN to improve the capturing of prosodic voice information, to reduce the error rate for pure voice comparison by the predicted rate (see code). Additionally, using a different clustering algorithm on top of the simple CNN feature embeddings also proved valuable.
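How a clustering step can sit on top of per-segment embeddings is sketched below. The example uses plain agglomerative clustering with average linkage and cosine distance as a generic stand-in; it is not the specific algorithm used in our papers. The function names, the toy embeddings and the `max_distance` stopping threshold are assumptions made for illustration.

```python
import math


def cosine_distance(a, b):
    """1 - cosine similarity; assumes both vectors are non-zero."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / norm


def cluster_segments(embeddings, max_distance):
    """Merge the closest clusters until none are closer than max_distance.

    Returns a list of clusters, each a list of segment indices. The number
    of speakers is not given; it results from the stopping threshold.
    """
    clusters = [[i] for i in range(len(embeddings))]

    def linkage(first, second):
        # Average linkage: mean distance over all cross-cluster pairs.
        dists = [
            cosine_distance(embeddings[i], embeddings[j])
            for i in first
            for j in second
        ]
        return sum(dists) / len(dists)

    while len(clusters) > 1:
        dist, a, b = min(
            (linkage(clusters[a], clusters[b]), a, b)
            for a in range(len(clusters))
            for b in range(a + 1, len(clusters))
        )
        if dist > max_distance:
            break
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return clusters


# Usage: four toy "voice embeddings" that form two obvious groups.
toy_embeddings = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
speakers = cluster_segments(toy_embeddings, max_distance=0.2)
```

Stopping on a distance threshold rather than a fixed cluster count mirrors the task definition: the number of speakers is unknown beforehand.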

Architecture of the successful RNN model for speaker clustering

This line of research has also led to work on other audio processing tasks like media segmentation and classification, musical instrument recognition, audio fingerprinting, or voice transfer, mainly driven forward in student thesis projects.

Selected references (see also below)

 

Data science

I helped create one of Europe’s first dedicated research centers for data science, the ZHAW Datalab, and have led it ever since. Subsequently, my colleagues and I created one of Switzerland’s first continuing education programs in data science, the MAS Data Science, where I teach machine learning. In 2015, we started rolling out the successful Datalab collaboration model country-wide by founding the Swiss Alliance for Data-Intensive Services, a network of industrial and academic partner institutions that also furthered the Swiss Conference on Data Science series of events that started in Winterthur. The experience gained in these activities, together with the feedback from the applied research projects described above, led to a book I am co-editing together with my colleagues Martin Braschler and Kurt Stockinger.

The data science skill set map

Selected references (see also below)

 

 

Collaborations

I frequently collaborate with industry to work on exciting pattern recognition use cases. My partners include start-ups, SMEs and multi-national enterprises alike.

In academia, I frequently work together with the Machine Learning and Optimization Lab of Martin Jaggi at EPFL, Marcello Pelillo of the Ca’Foscari University of Venice, Juergen Schmidhuber’s group at IDSIA, Friedhelm Schwenker of the Institute of Neural Information Processing at Ulm University, and Boi Faltings of EPFL. We have joint research projects and/or co-supervise PhD students.

If you are interested in a collaboration, please contact me.

 

 

Publications

Compare bibliometrics on Google scholar and ResearchGate.

Year Tag Type Publication
2020 robust deep learning conf. paper Stefan Glüge, Mohammadreza Amirian, Dandolo Flumini, and Thilo Stadelmann. How (Not) to Measure Bias in Face Recognition Networks. In: Proceedings of the 9th IAPR TC 3 Workshop on Artificial Neural Networks for Pattern Recognition (ANNPR’20), Springer, LNAI, Winterthur, Switzerland, September 02-04, 2020.
2020 robust deep learning workshop paper Mohammadreza Amirian, Lukas Tuggener, Ricardo Chavarriaga, Yvan Putra Satyawan, Frank-Peter Schilling, Friedhelm Schwenker, and Thilo Stadelmann. Two to Trust: AutoML for Safe Modelling and Interpretable Deep Learning for Robustness. In: Proceedings of the 1st TAILOR Workshop on Trustworthy AI at ECAI 2020, Santiago de Compostela, Spain, September 04-06, 2020. Springer.
2020 reinforcement learning short paper, best poster presentation award Dano Roost, Ralph Meier, Stephan Huschauer, Erik Nygren, Adrian Egli, Andreas Weiler, and Thilo Stadelmann. Improving Sample Efficiency and Multi-Agent Communication in RL-based Train Rescheduling. In: Proceedings of the 7th Swiss Conference on Data Science (SDS’20), Lucerne, Switzerland, June 26, 2020. IEEE.
2019 learning to learn workshop paper Mohammadreza Amirian, Katharina Rombach, Lukas Tuggener, Frank-Peter Schilling, and Thilo Stadelmann. Efficient Deep CNNs for Cross-Modal Automated Computer Vision under Time and Space Constraints. In: AutoCV2 Workshop at European Conference on Machine Learning / European Conference on Principles and Practice of Knowledge Discovery in Databases (ECML-PKDD), Wuerzburg, Germany, September 16-19, 2019.
2019 data science chapter Kurt Stockinger, Martin Braschler, and Thilo Stadelmann. Lessons Learned from Challenging Data Science Case Studies. In: Martin Braschler, Thilo Stadelmann, and Kurt Stockinger (Editors). “Applied Data Science - Lessons Learned for the Data-Driven Business”. Springer, 2019.
2019 data science chapter Lukas Hollenstein, Lukas Lichtensteiger, Thilo Stadelmann, Mohammadreza Amirian, Lukas Budde, Jürg Meierhofer, Rudolf M. Füchslin, and Thomas Friedli. Unsupervised Learning and Simulation for Complexity Management in Business Operations. In: Martin Braschler, Thilo Stadelmann, and Kurt Stockinger (Editors). “Applied Data Science - Lessons Learned for the Data-Driven Business”. Springer, 2019.
2019 robust deep learning chapter Thilo Stadelmann, Vasily Tolkachev, Beate Sick, Jan Stampfli, and Oliver Dürr. Beyond ImageNet - Deep Learning in Industrial Practice. In: Martin Braschler, Thilo Stadelmann, and Kurt Stockinger (Editors). “Applied Data Science - Lessons Learned for the Data-Driven Business”. Springer, 2019.
2019 data science chapter Jürg Meierhofer, Thilo Stadelmann, and Mark Cieliebak. Data Products. In: Martin Braschler, Thilo Stadelmann, and Kurt Stockinger (Editors). “Applied Data Science - Lessons Learned for the Data-Driven Business”. Springer, 2019.
2019 data science chapter Thilo Stadelmann, Kurt Stockinger, Gundula Heinatz-Bürki, and Martin Braschler. Data Scientists. In: Martin Braschler, Thilo Stadelmann, and Kurt Stockinger (Editors). “Applied Data Science - Lessons Learned for the Data-Driven Business”. Springer, 2019.
2019 data science chapter Martin Braschler, Thilo Stadelmann, and Kurt Stockinger. Data Science. In: Martin Braschler, Thilo Stadelmann, and Kurt Stockinger (Editors). “Applied Data Science - Lessons Learned for the Data-Driven Business”. Springer, 2019.
2019 data science chapter Thilo Stadelmann, Martin Braschler, and Kurt Stockinger. Introduction to Applied Data Science. In: Martin Braschler, Thilo Stadelmann, and Kurt Stockinger (Editors). “Applied Data Science - Lessons Learned for the Data-Driven Business”. Springer, 2019.
2019 data science book Martin Braschler, Thilo Stadelmann, and Kurt Stockinger (Editors). Applied Data Science - Lessons Learned for the Data-Driven Business. Springer, 2019.
2019 AI, digital transformation essay Thilo Stadelmann. Wie maschinelles Lernen den Markt verändert. In: Reinhard Haupt, Stephan Schmitz (Editors), “Digitalisierung: Datenhype mit Werteverlust? Ethische Perspektiven für eine Schlüsseltechnologie”, pp. 67-79, ISBN 377516040X, SCM Hänssler, 2019.
2019 learning to learn conf. paper Lukas Tuggener, Mohammadreza Amirian, Katharina Rombach, Stefan Lörwald, Anastasia Varlet, Christian Westermann, and Thilo Stadelmann. Automated Machine Learning in Practice: State of the Art and Recent Results. In: Proceedings of the 6th Swiss Conference on Data Science (SDS’19), Bern, Switzerland, June 14, 2019. IEEE.
2018 optical music recognition workshop paper Ismail Elezi, Lukas Tuggener, Marcello Pelillo, and Thilo Stadelmann. DeepScores and Deep Watershed Detection: current state and open issues. In: Proceedings of the 1st International Workshop on Reading Music Systems (WoRMS’18), Paris, France, September 20, 2018.
2018 robust deep learning invited paper Thilo Stadelmann, Mohammadreza Amirian, Ismail Arabaci, Marek Arnold, Gilbert François Duivesteijn, Ismail Elezi, Melanie Geiger, Stefan Lörwald, Benjamin Bruno Meier, Katharina Rombach, and Lukas Tuggener. Deep Learning in the Wild. In: Proceedings of the 8th IAPR TC 3 Workshop on Artificial Neural Networks for Pattern Recognition (ANNPR’18), Springer, LNAI 11081, pp. 17-38, Siena, Italy, September 19-21, 2018.
2018 robust deep learning conf. paper Mohammadreza Amirian, Friedhelm Schwenker, and Thilo Stadelmann. Trace and Detect Adversarial Attacks on CNNs using Feature Response Maps. In: Proceedings of the 8th IAPR TC 3 Workshop on Artificial Neural Networks for Pattern Recognition (ANNPR’18), Springer, LNAI 11081, pp. 346-358, Siena, Italy, September 19-21, 2018.
2018 voice recognition conf. paper Thilo Stadelmann, Sebastian Glinski-Haefeli, Patrick Gerber, and Oliver Dürr. Capturing Suprasegmental Features of a Voice with RNNs for Improved Speaker Clustering. In: Proceedings of the 8th IAPR TC 3 Workshop on Artificial Neural Networks for Pattern Recognition (ANNPR’18), Springer, LNAI 11081, pp. 333-345, Siena, Italy, September 19-21, 2018.
2018 learning to learn conf. paper Benjamin Bruno Meier, Ismail Elezi, Mohammadreza Amirian, Oliver Dürr, and Thilo Stadelmann. Learning Neural Models for End-to-End Clustering. In: Proceedings of the 8th IAPR TC 3 Workshop on Artificial Neural Networks for Pattern Recognition (ANNPR’18), Springer, LNAI 11081, pp. 126-138, Siena, Italy, September 19-21, 2018.
2018 optical music recognition conf. paper Lukas Tuggener, Ismail Elezi, Jürgen Schmidhuber, and Thilo Stadelmann. Deep watershed detector for music object recognition. In: Proceedings of the 19th International Society for Music Information Retrieval Conference (ISMIR’18), Paris, France, September 23-27, 2018. Paris: Society for Music Information Retrieval. DOI.
2018 voice recognition conf. paper Feliks Hibraj, Sebastiano Vascon, Thilo Stadelmann, and Marcello Pelillo. Speaker clustering using dominant sets. In: Proceedings of the 24th International Conference on Pattern Recognition (ICPR 2018). 24th International Conference on Pattern Recognition (ICPR’18), Beijing, China, 20-28 August 2018. Beijing: IAPR. DOI.
2018 optical music recognition conf. paper Lukas Tuggener, Ismail Elezi, Jürgen Schmidhuber, Marcello Pelillo, and Thilo Stadelmann. DeepScores: a dataset for segmentation, detection and classification of tiny objects. In: Proceedings of the 24th International Conference on Pattern Recognition. 24th International Conference on Pattern Recognition (ICPR’18), Beijing, China, 20-28 August 2018. Beijing: IAPR. 1-6. DOI.
2017 document recognition conf. paper Benjamin Meier, Thilo Stadelmann, Jan Stampfli, Marek Arnold, and Mark Cieliebak. Fully convolutional neural networks for newspaper article segmentation. In: Proceedings of the 14th IAPR International Conference on Document Analysis and Recognition (ICDAR’17). 14th IAPR International Conference on Document Analysis and Recognition (ICDAR), Kyoto, Japan, November 13-15, 2017. Kyoto, Japan: CPS. DOI.
2017 voice recognition conf. paper Yanick X. Lukic, Carlo Vogt, Oliver Dürr, and Thilo Stadelmann. Learning Embeddings for Speaker Clustering Based on Voice Equality. In: Proceedings of the 27th IEEE International Workshop on Machine Learning for Signal Processing (MLSP’17). Roppongi, Tokyo, Japan: IEEE. DOI.
2016 voice recognition conf. paper Yanick Lukic, Carlo Vogt, Oliver Dürr, and Thilo Stadelmann. Speaker Identification and Clustering using Convolutional Neural Networks. In: Proceedings of IEEE International Workshop on Machine Learning for Signal Processing (MLSP’16). Salerno: IEEE. DOI.
2016 data science chapter Kurt Stockinger, Thilo Stadelmann, and Andreas Ruckstuhl. Data Scientist als Beruf. Big Data – Grundlagen, Systeme und Nutzungspotenziale, Springer Verlag, Edition HMD 59-81, 2016. DOI.
2015 AI invited paper Jean-Daniel Dessimoz, Jana Koehler, and Thilo Stadelmann. AI in Switzerland. AI Magazine. 36(2), pp. 102-105, 2015. DOI.
2015 data science position paper Thilo Stadelmann, Mark Cieliebak, and Kurt Stockinger. Toward automatic data curation for open data. ERCIM News. 2015(100), pp. 32-33. DOI.
2014 data science chapter Kurt Stockinger, and Thilo Stadelmann. Data Science für Lehre, Forschung und Praxis. HMD Praxis der Wirtschaftsinformatik. 51(4), pp. 469-479, 2014. DOI.
2013 data science conf. paper Thilo Stadelmann, Kurt Stockinger, Martin Braschler, Mark Cieliebak, Gerold Baudinot, Oliver Dürr, and Andreas Ruckstuhl. Applied data science in Europe: challenges for academia in keeping up with a highly demanded topic. In: Proceedings of the 9th European Computer Science Summit (ECSS’13), Amsterdam, October 8–9, 2013.
2012 automatic driving conf. paper Thilo Stadelmann, Sven Johr, Michael Ditze, Florian Dittman, and Viktor Fässler. FABELHAFT - Fahrerablenkung: Entwicklung eines Meta-Fahrerassistenzsystems durch Echtzeit-Audioklassifikation. In Proceedings of 28. VDI-VW Gemeinschaftstagung Fahrerassistenzsysteme und Integrierte Sicherheit ‘12, Wolfsburg, Germany, October 10-11, 2012. VDI Wissensforum.
2010 voice recognition PhD thesis Thilo Stadelmann. Voice Modeling Methods for Automatic Speaker Recognition. Dissertation, Philipps-Universität Marburg. Available online, 2010.
2010 voice recognition tech report Thilo Stadelmann & Bernd Freisleben. On the MixMax Model and Cepstral Features for Noise-Robust Voice Recognition. Technical report, Philipps-Universität Marburg, April 2010.
2010 voice recognition workshop paper Christian Beecks, Thilo Stadelmann, Bernd Freisleben, and Thomas Seidl. Visual Speaker Model Exploration. In Proceedings of the IEEE International Conference on Multimedia and Expo (ICME’2010), pages 727-728, Singapore, July 19-23, 2010, IEEE.
2010 robust machine learning conf. paper Thilo Stadelmann, Yinghui Wang, Matthew Smith, Ralph Ewerth, and Bernd Freisleben. Rethinking Algorithm Development and Design in Speech Processing. In Proceedings of the 20th International Conference on Pattern Recognition (ICPR’10), pages 4476-4479, Istanbul, Turkey, August 2010. IAPR.
2010 voice recognition conf. paper Thilo Stadelmann and Bernd Freisleben. Dimension-Decoupled Gaussian Mixture Model for Short Utterance Speaker Recognition. In Proceedings of the 20th International Conference on Pattern Recognition (ICPR’10), pages 1602-1605, Istanbul, Turkey, August 2010. IAPR.
2009 video analysis workshop paper Markus Mühling, Ralph Ewerth, Thilo Stadelmann, Bing Shi, and Bernd Freisleben. University of Marburg at TRECVID 2009: High-Level Feature Extraction. In Proceedings of TREC Video Retrieval Evaluation Workshop (TRECVid’09). Available online, 2009.
2009 SoA workshop paper Ernst Juhnke, Dominik Seiler, Thilo Stadelmann, Tim Dörnemann, and Bernd Freisleben. LCDL: An Extensible Framework for Wrapping Legacy Code. In Proceedings of International Workshop on @WAS Emerging Research Projects, Applications and Services (ERPAS’09), pages 638-642, Kuala Lumpur, Malaysia, December 2009.
2009 SoA conf. paper Dominik Seiler, Ralph Ewerth, Steffen Heinzl, Thilo Stadelmann, Markus Mühling, Bernd Freisleben, and Manfred Grauer. Eine Service-Orientierte Grid-Infrastruktur zur Unterstützung Medienwissenschaftlicher Filmanalyse. In Proceedings of the Workshop on Gemeinschaften in Neuen Medien (GeNeMe’09), pages 79-89, Dresden, Germany, September 2009.
2009 voice recognition conf. paper Thilo Stadelmann and Bernd Freisleben. Unfolding Speaker Clustering Potential: A Biomimetic Approach. In Proceedings of the ACM International Conference on Multimedia (ACMMM’09), pages 185-194, Beijing, China, October 2009. ACM.
2009 voice recognition conf. paper Thilo Stadelmann, Steffen Heinzl, Markus Unterberger, and Bernd Freisleben. WebVoice: A Toolkit for Perceptual Insights into Speech Processing. In Proceedings of the 2nd International Congress on Image and Signal Processing (CISP’09), pages 4358-4362, Tianjin, China, October 2009.
2009 SoA conf. paper Steffen Heinzl, Markus Mathes, Thilo Stadelmann, Dominik Seiler, Marcel Diegelmann, Helmut Dohmann, and Bernd Freisleben. The Web Service Browser: Automatic Client Generation and Efficient Data Transfer for Web Services. In Proceedings of the 7th IEEE International Conference on Web Services (ICWS’09), pages 743-750, Los Angeles, CA, USA, July 2009. IEEE Press.
2009 SoA journal paper Steffen Heinzl, Dominik Seiler, Ernst Juhnke, Thilo Stadelmann, Ralph Ewerth, Manfred Grauer, and Bernd Freisleben. A Scalable Service-Oriented Architecture for Multimedia Analysis, Synthesis, and Consumption. International Journal of Web and Grid Services, 5(3):219-260, 2009. Inderscience Publishers.
2008 video analysis workshop paper Markus Mühling, Ralph Ewerth, Thilo Stadelmann, Bing Shi, and Bernd Freisleben. University of Marburg at TRECVID 2008: High-Level Feature Extraction. In Proceedings of TREC Video Retrieval Evaluation Workshop (TRECVid’08). Available online, 2008.
2007 video analysis workshop paper Markus Mühling, Ralph Ewerth, Thilo Stadelmann, Bing Shi, Christian Zöfel, and Bernd Freisleben. University of Marburg at TRECVID 2007: Shot Boundary Detection and High-Level Feature Extraction. In Proceedings of TREC Video Retrieval Evaluation Workshop (TRECVid’07). Available online, 2007.
2007 video analysis conf. paper Ralph Ewerth, Markus Mühling, Thilo Stadelmann, Julinda Gllavata, Manfred Grauer, and Bernd Freisleben. Videana: A Software Toolkit for Scientific Film Studies. In Proceedings of the International Workshop on Digital Tools in Film Studies ‘07, pages 1-16, Siegen, Germany, 2007. Transcript Verlag.
2007 video analysis conf. paper Markus Mühling, Ralph Ewerth, Thilo Stadelmann, Bernd Freisleben, Rene Weber, and Klaus Mathiak. Semantic Video Analysis for Psychological Research on Violence in Computer Games. In Proceedings of the ACM International Conference on Image and Video Retrieval (CIVR’07), pages 611-618, Amsterdam, The Netherlands, July 2007. ACM.
2006 video analysis workshop paper Ralph Ewerth, Markus Mühling, Thilo Stadelmann, Ermir Qeli, Björn Agel, Dominik Seiler, and Bernd Freisleben. University of Marburg at TRECVID 2006: Shot Boundary Detection and Rushes Task Results. In Proceedings of TREC Video Retrieval Evaluation Workshop (TRECVid’06). Available online, 2006.
2006 voice recognition conf. paper Thilo Stadelmann and Bernd Freisleben. Fast and Robust Speaker Clustering Using the Earth Mover’s Distance and MixMax Models. In Proceedings of the 31st IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP’06), volume 1, pages 989-992, Toulouse, France, April 2006. IEEE.
2005 video analysis workshop paper Ralph Ewerth, Christian Behringer, Tobias Kopp, Michael Niebergall, Thilo Stadelmann, and Bernd Freisleben. University of Marburg at TRECVID 2005: Shot Boundary Detection and Camera Motion Estimation Results. In Proceedings of TREC Video Retrieval Evaluation Workshop (TRECVid’05). Available online, 2005.
2004 voice recognition diploma thesis Thilo Stadelmann. Sprechererkennung in Videos. Diplomarbeit, Fachhochschule Giessen-Friedberg, 2004.