Multimedia Computing Project: Interactive 3D Media Gallery [V0.99]

[Please read Notes below.]

Important dates

The project and report requirements are presented next.

Create a (preferably) 3D space where a user can move and interact with images and videos, displayed as virtual frames in a gallery, featuring overlays with extracted information such as:

- luminance

- edges

- dominant colors

- ...

Additional metadata such as edge histograms, luminance, color, texture, the number of faces, or keypoints/descriptors enables automatic similarity detection, allowing users to navigate and group image categories based on visual features. Properties can also be revealed by touching an image or video.

Experiment with different behaviors for images or videos with different characteristics - images with dominant warm colors could move in one direction, images with many edges could have a certain type of motion, while videos with many cuts could have another. For videos it is only mandatory to compute a measure of rhythm or action, based on differences between consecutive frames (see the sketch below); you do not have to calculate the other metadata.
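For example, a crude rhythm measure can average the absolute pixel difference between sampled frames; peaks in this value indicate cuts or fast action. A minimal sketch, assuming the video is already loaded into an ofVideoPlayer and that setFrame() followed by update() makes the requested frame available (the function name and the sampling step are illustrative):

float rhythmMeasure(ofVideoPlayer& video, int step = 10)
{
    ofPixels prev;
    double total = 0;
    int comparisons = 0;
    for (int f = 0; f < video.getTotalNumFrames(); f += step) {
        video.setFrame(f);
        video.update();                          // decode the requested frame
        ofPixels cur = video.getPixels();
        if (prev.isAllocated()) {
            double diff = 0;
            for (size_t i = 0; i < cur.size(); i++)
                diff += std::abs((int)cur[i] - (int)prev[i]);
            total += diff / cur.size();          // mean absolute difference, 0..255
            comparisons++;
        }
        prev = cur;
    }
    return comparisons > 0 ? (float)(total / comparisons) : 0.0f;
}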

Use simple camera-based gesture recognition to navigate the image/video galleries.

The necessary metadata to support the operations above should be represented in XML format.
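For example, with the ofxXmlSettings addon (the one used by xmlSettingsExample), a helper writing one XML file per media file could look like the sketch below - the helper and the tag names are illustrative, not a required schema:

#include "ofxXmlSettings.h"

void saveMetadata(const std::string& mediaFile, float luminance, int numFaces)
{
    ofxXmlSettings xml;
    xml.addTag("metadata");
    xml.pushTag("metadata");               // write inside <metadata>...</metadata>
    xml.setValue("file", mediaFile);
    xml.setValue("luminance", luminance);
    xml.setValue("numFaces", numFaces);
    xml.popTag();
    xml.saveFile(mediaFile + ".xml");      // e.g., painting.jpg.xml
}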

Please pay attention to the data and results that you use in the project. Results should be coherent and aligned with the data that is used.

Deliverable 1: Specification

This is the initial specification of the system. The specification should include an interface sketch/storyboard as a first approach to the interface, considering the requirements and features described above. User interface elements should be identified and described. The sketch can be done using several tools (e.g., pen-based interfaces or even digitized drawings) and should be included in the document. This type of low-fidelity prototype provides an initial specification of the interface elements and the sequence of actions. The document should also include a preliminary class structure.

Deliverable 2: Final Report + Code

The following structure is suggested for the final report (up to 6 pages), including part of the content from the initial specification:

The report should also include as appendix:


Notes [Updated 22/5/2025.]


21/5:

The easyCam example can be used for the 3D gallery. The simplest version is to use the camera as in the example and replace the objects in the example with the gallery: walls, floor, doors, etc. A wall can be an ofBoxPrimitive, and to draw images you can use:

// p is an ofPlanePrimitive; paintingTexture is an ofImage:
// paintingTexture.load("painting.jpg");
ofPushMatrix();
  paintingTexture.bind();    // use the image as the plane's texture
  p.draw();                  // draw the textured plane
  paintingTexture.unbind();
ofPopMatrix();
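Putting the pieces together, a minimal sketch of one textured "painting" inside an ofEasyCam scene (the plane size and file name are illustrative):

ofEasyCam cam;
ofImage paintingTexture;
ofPlanePrimitive p;

void ofApp::setup(){
    paintingTexture.load("painting.jpg");
    p.set(200, 150);                                          // plane size in world units
    p.mapTexCoordsFromTexture(paintingTexture.getTexture());  // map the image onto the plane
}

void ofApp::draw(){
    cam.begin();                   // mouse-controlled camera, as in the easyCam example
    paintingTexture.bind();
    p.draw();
    paintingTexture.unbind();
    cam.end();
}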

20/5:

Using OpenCV (Image Processing)

// Example using the ofxCv addon
// https://github.com/kylemcdonald/ofxCv

#include "ofxOpenCv.h"   // classic addon: ofxCvColorImage, ofxCvHaarFinder, ...
#include "ofxCv.h"       // ofxCv addon: toCv() and other bridges to cv::Mat

using namespace cv;
using namespace ofxCv;

// c::f() is a placeholder method; pixelData is an ofPixels holding the image
void c::f()
{
    ofxCvColorImage img;
    img.setFromPixels(pixelData);         // wrap the pixels in an ofxOpenCv image
    cv::Mat m = toCv(img.getPixels());    // view the same pixels as a cv::Mat
    cv::InputArray src(m);                // if needed
}


Edge distribution

Edge distribution can be obtained by applying one or more edge filters (see slides) and counting or averaging the results. An edge histogram is a good way to represent the edge distribution. Using OpenCV, edge filters can be applied with:
filter2D(src, dst, ddepth, kernel, anchor, delta, BORDER_DEFAULT);
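For example, a minimal two-bin edge histogram, assuming a grayscale cv::Mat as input (the function name and the Sobel-style kernels are illustrative; more orientations simply mean more kernels and more bins):

std::vector<float> edgeHistogram(const cv::Mat& gray)
{
    cv::Mat kh = (cv::Mat_<float>(3, 3) << -1, -2, -1,  0, 0, 0,  1, 2, 1);  // horizontal edges
    cv::Mat kv = (cv::Mat_<float>(3, 3) << -1, 0, 1,  -2, 0, 2,  -1, 0, 1);  // vertical edges

    cv::Mat rh, rv;
    cv::filter2D(gray, rh, CV_32F, kh);    // anchor, delta and border use the defaults
    cv::filter2D(gray, rv, CV_32F, kv);

    std::vector<float> hist(2);
    hist[0] = (float)cv::mean(cv::abs(rh))[0];   // average absolute response per orientation
    hist[1] = (float)cv::mean(cv::abs(rv))[0];
    return hist;
}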

Texture characteristics

Texture can be evaluated using Gabor filters. Currently, Gabor filters are supported by OpenCV: the kernels (filters) can be generated with getGaborKernel and applied with filter2D. There are options for the parameters depending on the desired result. Several orientations and frequencies/wavelengths could be used, resulting in a bank with multiple filters (for example, 6 orientations and 4 frequencies).
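A minimal sketch of such a bank (6 orientations x 4 wavelengths), assuming a grayscale cv::Mat as input; the parameter values are illustrative:

std::vector<float> gaborDescriptor(const cv::Mat& gray)
{
    std::vector<float> features;
    const double sigma = 4.0, gamma = 0.5, psi = 0.0;
    const double lambdas[4] = { 4, 8, 16, 32 };   // wavelengths

    for (int o = 0; o < 6; o++) {                 // 6 orientations
        double theta = o * CV_PI / 6.0;
        for (double lambda : lambdas) {
            cv::Mat kernel = cv::getGaborKernel(cv::Size(21, 21), sigma, theta,
                                                lambda, gamma, psi, CV_32F);
            cv::Mat response;
            cv::filter2D(gray, response, CV_32F, kernel);
            features.push_back((float)cv::mean(cv::abs(response))[0]);
        }
    }
    return features;   // 24 values: one mean response per filter in the bank
}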


28/3:


1 - At least one pixel processing algorithm (color or light) applied to the images and the result stored as metadata (Example: videoGrabberExample); see the luminance sketch after this list.

2 - Simple motion detection using the camera (Example: opencvExample); see the frame-difference sketch after this list.

3 - Use both methods above and both addons (ofxOpenCv and ofxCv) to interface with OpenCV.
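For point 1, a minimal luminance sketch, assuming an RGB(A) ofImage (the helper name and the Rec. 601 weights are illustrative choices):

float averageLuminance(const ofImage& img)
{
    const ofPixels& pix = img.getPixels();
    double sum = 0;
    for (size_t i = 0; i < pix.size(); i += pix.getNumChannels()) {
        sum += 0.299 * pix[i] + 0.587 * pix[i + 1] + 0.114 * pix[i + 2];
    }
    return (float)(sum / (pix.getWidth() * pix.getHeight()));   // 0..255
}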
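For point 2, a frame-difference sketch in the style of the opencvExample, assuming the grabber and the ofxOpenCv images are set up and allocated in setup() with the camera size; it also bridges to cv:: via ofxCv, as point 3 asks:

ofVideoGrabber grabber;
ofxCvColorImage colorImg;
ofxCvGrayscaleImage grayNow, grayPrev, grayDiff;

void ofApp::update(){
    grabber.update();
    if (grabber.isFrameNew()) {
        colorImg.setFromPixels(grabber.getPixels());
        grayPrev = grayNow;          // keep the previous frame
        grayNow  = colorImg;         // convert the new frame to grayscale
        grayDiff.absDiff(grayPrev, grayNow);
        grayDiff.threshold(30);      // keep strong differences only
        // amount of motion: number of changed pixels (store it or use it to navigate)
        int changed = cv::countNonZero(ofxCv::toCv(grayDiff.getPixels()));
    }
}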



21/3:

1 - A gallery with images and video (Examples: dirListExample, videoPlayerExample); see the folder-listing sketch after this list.

2 - The capability to play videos and show images full screen (Example: videoPlayerExample).

3 - The ability to show (and hide, using a key) the live camera image being captured (Example: videoGrabberExample).

4 - Face detection on the camera image (Example: opencvHaarFinderExample); see the face-counting sketch after this list.

5 - Read and write XML - one XML file for each image or video file with metadata (Example: xmlSettingsExample).
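For point 1, a folder-listing sketch in the style of the dirListExample (the folder name and extensions are illustrative; videos can be listed the same way and loaded into ofVideoPlayer objects):

ofDirectory dir;
std::vector<ofImage> images;

void ofApp::setup(){
    dir.allowExt("jpg");          // restrict the listing to these extensions
    dir.allowExt("png");
    dir.listDir("gallery/");      // folder inside bin/data
    dir.sort();
    images.resize(dir.size());
    for (size_t i = 0; i < dir.size(); i++) {
        images[i].load(dir.getPath(i));
    }
}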
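For point 4, a minimal face-counting sketch based on the opencvHaarFinderExample (the helper name is illustrative; in a real app the finder would be set up once, not per call):

#include "ofxCvHaarFinder.h"

int countFaces(ofImage& img)
{
    ofxCvHaarFinder finder;
    finder.setup("haarcascade_frontalface_default.xml");   // cascade file from the example's data folder
    finder.findHaarObjects(img);
    return (int)finder.blobs.size();                       // one blob per detected face
}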


14/3: Introduction to openFrameworks and the project. Start by experimenting with the videoPlayerExample and the dirListExample. Both examples are in the distribution in the /examples folder - in the video and input-output folders, respectively. Together, these two examples support some of the requirements of the project - namely displaying image and video files and listing a folder with this type of content. Both examples follow the standard OF model with setup(), update() and draw() methods, and they also include event handlers, for example to handle key presses or mouse moves. The goal is to extend the examples by displaying a gallery of images and videos. It should be possible to play, pause, and resume the videos.

The goal for this class is to display images and video in the same window - like a gallery with rows and columns of image and video thumbnails.
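A minimal sketch of such a grid, assuming a std::vector<ofImage> images like the one filled in the folder-listing sketch above (the grid dimensions are illustrative; video thumbnails can be drawn the same way with ofVideoPlayer::draw):

void ofApp::draw(){
    const int cols = 4;
    const float thumbW = 160, thumbH = 120, margin = 10;
    for (size_t i = 0; i < images.size(); i++) {
        float x = margin + (i % cols) * (thumbW + margin);
        float y = margin + (i / cols) * (thumbH + margin);
        images[i].draw(x, y, thumbW, thumbH);   // draw scaled into the grid cell
    }
}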