Design Wall for Meeting Rooms
(Multimedia Computing Project [V0.97])

[Please read Notes below.]

Important dates

Deliverable 1: Specification and state of the art study (related work) - max 4 pages + images in annex if needed: April 30. Send to nmc@fct.unl.pt. With this deliverable the team members should be identified and should not change after this date.
Deliverable 2: Source code and final report: *** June 14 ***, by email or using a service such as Google Drive. The visualizations and discussions will take place after.

Send the code to nmc@fct.unl.pt in zip format. The file should have the student numbers separated with "-". Example: 11111-2222.zip. It should include all the files required to compile and run the program. The final report should also be included. If large video files are needed please use a service such as Dropbox or Google Drive.

The project and report requirements are presented next. There are some open options and additional or alternative features are encouraged. Examples include adding 3D support for display or a precise face tracker.

Goals

A design wall for meeting rooms with a large shared screen is a collaborative workspace for brainstorming, ideation, and creative collaboration among team members. The screen is a a focal point for collaborative design activities, e.g., for products. Participants can collaborate in real-time, and the design is displayed on the shared screen. This enables iterative design processes, where team members can collectively refine and iterate on design concepts, while also providing feedback and suggestions to one another.

The project goal is then to build a system to control the interaction with the content on the screen. It will be able to display and control images and video in order to see design alternatives, previous approaches or new ideas. Face detection technology can identify participants as they approach the screen, allowing for personalized greetings, or user-specific interactions. It can also be used for analyzing audience reactions and engagement levels during presentations, allowing to adjust content accordingly. Object recognition can be used to detect and analyze physical objects or documents placed in front of the screen, triggering relevant digital content or actions. Simple gesture recognition technology is used for interaction with the screen without physical input devices.

Interaction with the Design Wall

Interaction will be done through a video camera and touch (could be simulated using a mouse).

In the display, is should be possible to visualize images and videos in a grid. Playing the video or displaying the image in high resolution should also be possible.

https://www.clevelandart.org/artlens-gallery/artlens-wall

https://www.youtube.com/watch?v=yUNGUNHPqCM

http://tabler.tv/video-wall.html

Metadata

Each image and video clip should have the following metadata. This metadata is used to compare images. Metadata extraction could be implemented as standalone program or as part of the display and interaction system.

Tags (keywords)

Luminance

Color - based on first moment

Number of faces appearing in the image or video
Edge distribution
Texture characteristics (for images and first frame of the video)
Number of times a specific object (input as an image) appears in the video frame

This metadata can be stored in an XML file, so that it is only necessary to process each image or video once. Additionally, the list of media items for a user or project could also be listed as an XML file.

Development and Library Support

In order to support access to media content, openFrameworks is the suggested framework. The videoPlayer and dirListExample are good starting points for the project. The opencvHaarFinderExample and the cameraLensOffsetExample show how to integrate OpenCV with openFrameworks and also use face detection. The additional ofxCv addon is also useful. See example below.

To integrate XML files (for example to store/retrieve metadata) the ofxXmlSettings and the sample application could be used. In order to build the User Interface ofxGui can be used. This code is included in the distribution but there are others in the openFrameworks site (addons).

Using OpenCV (Image Processing)

// Example using the OfxCv addon//https://github.com/kylemcdonald/ofxCv

#include
          "ofxCvHaarFinder.h"

          #include "ofxCv.h"

          

          using namespace cv;

          using namespace ofxCv;

          

          void c::f()

          {

              ofxCvColorImage img;

              img.setFromPixels(pixelData);

              cv::Mat m = toCv(img.getPixels()) ;

              cv::InputArray src(m) ; // if needed

          }

Edge distribution

Edge distribution can be obtained by using one or more edge filters (see slides) and counting or averaging the results. An edge histogram is a good way to represent the edge distribution. Using OpenCV edge filters can be applied with:
filter2D(src, dst, ddepth , kernel, anchor, delta, BORDER_DEFAULT );

Texture characteristics

Texture can be evaluated using Gabor filters. Currenty, Gabor filters are supported by OpenCV. The kernels (filters) can be generated with getGaborKernel and the filters are applied with filter2D. There are options for the parameters depending on the desired result. Several orientations could be used and frequencies/wavelengths resulting in a bank with multiple filters (for example 6 orientations and 4 frequencies).

Number of times a specific object (input as an image) appears in the image or in the video

To compare images, keypoint based methods usually give good results in image matching and OpenCV includes some of these methods (features2D). There are several examples using this framework in OpenCV that include the different processing stages: 1) keypoint detection; 2) descriptor extraction; 3) descriptor matching. The matchersimple example and description is a good starting point to understand the process. This example shows the new interface and the several detectors. There are different ways to detect the keypoints and extract the descriptors, including SIFT, SURF, ORB and FAST, that lead to different results and processing needs. A simple and effective approach is to base the code on the matchersimple example but using ORB instead of SURF, as SURF is part of the extended/non free features package (xfeatures2d).

Deliverable 1: Specification and Related Work Study

This is the initial specification of the system and a study of related work. Enhancing multimedia information with metadata to improve browse and search operations was explored by other authors in different situations and contexts. Some of these systems use advanced image and video processing algorithms. The main goal of this study is to provide an overview of the state of the art in this area. The results should be summarized and discussed in a short paper (max 4 pages + images in annex if needed, optional format here: https://www.acm.org/publications/proceedings-template). The paper should have the following structure:

Introduction: this section should briefly describe the context and the project.
Related work: should include an overview of related systems. Most of the information is available in scientific papers, such as the ones available from the ACM Digital Library (access is free at FCT/UNL). For outside access please use the VPN. An example:

The specification should include an interface sketch/storyboard as first approach to the interface, considering the requirements and features described above. User interface elements should be identified and described. The sketch can be done using several tools (e.g., pen based interfaces or even digitized drawings) and should be included in the document. This type of low fidelity prototypes provides an initial specification of the interface elements and the sequence of actions.
References: references to websites and papers that are relevant to the project. Last access dates should be included for the websites.

Deliverable 2: Final Report + Code

The following structure is suggested for the final report (up to 6 pages), including part of the content from the initial specification and related work:

Introduction
Related work [based on the state of the art study done previously]
Algorithms and techniques (short description of the processing algorithms and visualization techniques)
Application implementation (short description of how it works)
Class description
Conclusions
References

The report should also include as appendix:

Class diagram, if not included in the report.
User manual (1-2 pages).

Notes [Updated 2/5/2024.]

In order to implement what is required face recognition is not mandatory but some of the projects that you submitted, have work with face recognition - for example, when someone approaches the screen, recognition could be used to exit a screensaver and load the work space for that person. There are many ways to do face recognition and we can use what is provided by OpenCV. There is an addon but it is also possible to use it directly from OpenCV.

Meanwhile, do not forget that the list below should have been implemented by now!

1 - A gallery with images and video (Examples: dirListExample, videoPlayerExample).

2 - The capability to play videos and show images full screen (Example: videoPlayerExample).

3 - The ability to show (and hide using a key) the camera capturing images (Example: videoGrabberExample).

5 - Read and write xml - one xml file for each image or video file with metadata (Example: xmlSettingsExample).

1 - At least one pixel processing algorithm (color or light) applied to the images and the result stored as metadata (Example: videoGrabberExample).

22/3: Introduction to openFrameworks and the project. Experimenting the videoPlayer and the dirListExample. Both these examples are in the distribution in the /examples folder - in the video and input-output folder. The class structure should be planned from the beginning.

15/3: Introduction to openFrameworks and the project. Starting to experiment using the videoPlayer and the dirListExample. Both these examples are in the distribution in the /examples folder - in the video and input-output folders. Together, these two examples support some of the requirements in the project - namely displaying image and video files and listing a folder with this type of content. Both examples follow the standard OF model with setup(), update() and draw() methods. Also included event handlers, for example to handle key press or mouse move. The goal is to continue the example by displaying a gallery of images and videos - and in the process use the tests to specify the interface. It should be possible to play, pause, resume the videos.

The goal for this class is to display images and video in the same window - like a gallery with rows and columns of image and video thumbnails.

Design Wall for Meeting Rooms (Multimedia Computing Project [V0.97])