Pascal Mettes

Department of Information and Computing Sciences, Utrecht University


In this work, a unified approach to water detection in videos is provided from both a discriminative and a generative perspective. Although the automatic detection of water has a wide range of applications, little attention has been given to solving this specific problem. Current literature generally treats the problem as part of more general recognition tasks such as material recognition or dynamic texture recognition, without specifically analysing and characterizing the visual properties of water. To compensate for this lack of information, a discriminative algorithm is presented here by introducing a hybrid descriptor based on the joint spatial and temporal behaviour of local water surfaces. Furthermore, this work provides a mathematical analysis and intuitive interpretation of linear latent variable modeling for dynamic texture classification. Based on this analysis, a set of improvements is proposed specifically for generative water detection.
In addition, a novel water database is presented in this work, which goes beyond databases from related fields in terms of quantity and variety of both natural and man-made water scenes. Both perspectives are experimentally evaluated on this database and a subset of the DynTex database for the tasks of video segmentation and classification. The experiments performed on these two tasks indicate the effectiveness of the introduced algorithms for discriminating water from other, related dynamic and static surfaces and objects, outperforming well-known algorithms from directly related fields. The algorithms and database presented here form a basis for the relatively unexplored problem of water detection and can lead to tackling new problems such as (near) real-time detection, large-scale detection in images, and detection with camera movement.


Water Segmentation and Classification [PDF]
Pascal Mettes


On the segmentation and classification of water in videos
Pascal Mettes, Robby Tan and Remco Veltkamp
In Proceedings of the International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications, 2014, Lisbon, Portugal.

System details

Water database

The water database consists of multiple hours of video material (260 videos in total) of various water and non-water scenes. The images below display a number of examples in the database.

Local discriminative approach

Main idea: Classify each local part of each video independently using a novel temporal descriptor and a well-known spatial descriptor [1]. This algorithm is aided by a pre-processing step in which water colours and reflections are removed. Furthermore, spatio-temporal Markov Random Fields [2] are exploited to create a smooth segmentation from the independent local classifications. The system pipeline is as follows:
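To illustrate only the spatial-descriptor step of this approach, the sketch below computes a basic Local Binary Pattern histogram for a single grayscale patch. It is a simplified illustration under stated assumptions, not the implementation used in the thesis: it uses the plain 8-neighbour LBP code rather than the multiresolution, rotation-invariant variant of [1], and the function name `lbp_histogram` and the 256-bin layout are hypothetical choices made here.

```python
import numpy as np


def lbp_histogram(patch):
    """Normalised 256-bin histogram of basic 8-neighbour LBP codes.

    `patch` is a 2-D grayscale array of at least 3x3 pixels. Border pixels
    are skipped because they lack a full 8-neighbourhood.
    """
    patch = np.asarray(patch, dtype=np.float64)
    h, w = patch.shape
    centre = patch[1:-1, 1:-1]

    # Offsets of the 8 neighbours, clockwise from the top-left pixel.
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]

    codes = np.zeros(centre.shape, dtype=np.int64)
    for bit, (dy, dx) in enumerate(offsets):
        # Shifted view of the patch, aligned with the interior pixels.
        neighbour = patch[1 + dy:h - 1 + dy, 1 + dx:w - 1 + dx]
        # Set this bit wherever the neighbour is at least as bright as the centre.
        codes |= (neighbour >= centre).astype(np.int64) << bit

    hist = np.bincount(codes.ravel(), minlength=256).astype(np.float64)
    return hist / hist.sum()
```

A histogram of this kind, computed per local patch, could serve as one half of a feature vector for a per-patch classifier. The temporal descriptor and the Markov Random Field smoothing are not covered by this sketch.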

Generative approach using Linear Dynamical Systems

Main idea: The second part of this thesis investigates Linear Dynamical Systems (LDS) [3] and their possible use for water detection. Three improvements are reported to make LDS work for this problem: an optimization of the parameter learning stage, a novel distance measure between LDS models, and a pyramid formulation of LDS for global and local detection. See the thesis for more details.
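For readers unfamiliar with LDS, the sketch below shows the standard closed-form (SVD-based) parameter estimate for a dynamic texture model in the spirit of [3]: each frame is modelled as y_t = C x_t + mean, with hidden states evolving as x_{t+1} = A x_t. This is only the baseline on which the thesis builds; none of the three improvements listed above are included, noise covariances are omitted, and the function name `fit_lds` and its arguments are hypothetical.

```python
import numpy as np


def fit_lds(frames, n_states):
    """Estimate baseline LDS parameters from a grayscale video.

    frames:   array of shape (T, H, W)
    n_states: dimension of the hidden state
    Returns (A, C, X, mean) such that frame t is approximated by
    C @ X[:, t] + mean, and X[:, t + 1] is approximated by A @ X[:, t].
    """
    frames = np.asarray(frames, dtype=np.float64)
    T = frames.shape[0]

    # Stack every frame as one column: Y has shape (pixels, T).
    Y = frames.reshape(T, -1).T
    mean = Y.mean(axis=1, keepdims=True)

    # Thin SVD of the mean-subtracted video.
    U, s, Vt = np.linalg.svd(Y - mean, full_matrices=False)

    C = U[:, :n_states]                      # observation matrix
    X = s[:n_states, None] * Vt[:n_states]   # hidden states, shape (n_states, T)

    # Least-squares fit of the state transition between consecutive states.
    A = X[:, 1:] @ np.linalg.pinv(X[:, :-1])
    return A, C, X, mean
```

Two videos could then be compared through their fitted `(A, C)` pairs, which is the role a distance measure between LDS models plays. The specific measure proposed in the thesis is described there and is not reproduced here.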

Visual results

Results over multiple frames are available on YouTube.


  1. Ojala, T., Pietikainen, M., & Maenpaa, T. (2002). Multiresolution gray-scale and rotation invariant texture classification with local binary patterns. IEEE Transactions on Pattern Analysis and Machine Intelligence, 24(7), 971-987.
  2. Boykov, Y., & Kolmogorov, V. (2004). An experimental comparison of min-cut/max-flow algorithms for energy minimization in vision. IEEE Transactions on Pattern Analysis and Machine Intelligence, 26(9), 1124-1137.
  3. Doretto, G., Chiuso, A., Wu, Y. N., & Soatto, S. (2003). Dynamic textures. International Journal of Computer Vision, 51(2), 91-109.