When working with video we must consider several functions, including (of course) how to read and write video files. We must also think about how to actually play back such files on the screen.
The first thing we need is the CvCapture device. This structure contains the information needed for reading frames from a camera or video file. Depending on the source, we use one of two different calls to create and initialize a CvCapture structure.
CvCapture* cvCreateFileCapture( const char* filename );
CvCapture* cvCreateCameraCapture( int index );
In the case of cvCreateFileCapture(), we can simply give a filename for an MPG or AVI file and OpenCV will open the file and prepare to read it. If the open is successful and we are able to start reading frames, a pointer to an initialized CvCapture structure will be returned.
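For a file on disk, the open might look like the following minimal sketch (the filename is just a placeholder); note the check on the returned pointer, which the next paragraph explains:

CvCapture* capture = cvCreateFileCapture( "my_video.avi" );  // placeholder filename
if( !capture ) {
    // the open failed: missing file or unsupported codec
}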
A lot of people don't always check these sorts of things, thinking that nothing will go wrong. Don't do that here. The returned pointer will be NULL if for some reason the file could not be opened (e.g., if the file does not exist), but cvCreateFileCapture() will also return a NULL pointer if the codec with which the video is compressed is not known. The subtleties of compression codecs are beyond the scope of this book, but in general you will need to have the appropriate library already resident on your computer in order to successfully read the video file. For example, if you want to read a file encoded with DIVX or MPG4 compression on a Windows machine, there are specific DLLs that provide the necessary resources to decode the video. This is why it is always important to check the return value of cvCreateFileCapture(), because even if it works on one machine (where the needed DLL is available) it might not work on another machine (where that codec DLL is missing).

Once we have the CvCapture structure, we can begin reading frames and do a number of other things. But before we get into that, let's take a look at how to capture images from a camera.
The routine cvCreateCameraCapture() works very much like cvCreateFileCapture() except without the headache from the codecs.[45] In this case we give an identifier that indicates which camera we would like to access and how we expect the operating system to talk to that camera. For the former, this is just an identification number that is zero (0) when we only have one camera, and increments upward when there are multiple cameras on the same system. The other part of the identifier is called the domain of the camera and indicates (in essence) what type of camera we have. The domain can be any of the predefined constants shown in Table 4-3.
Table 4-3. Camera "domain" indicates where HighGUI should look for your camera
| Camera capture constant | Numerical value |
|---|---|
| CV_CAP_ANY | 0 |
| CV_CAP_MIL | 100 |
| CV_CAP_VFW | 200 |
| CV_CAP_V4L | 200 |
| CV_CAP_V4L2 | 200 |
| CV_CAP_FIREWIRE | 300 |
| CV_CAP_IEEE1394 | 300 |
| CV_CAP_DC1394 | 300 |
| CV_CAP_CMU1394 | 300 |
When we call cvCreateCameraCapture(), we pass in an identifier that is just the sum of the domain index and the camera index. For example:
CvCapture* capture = cvCreateCameraCapture( CV_CAP_FIREWIRE );
In this example, cvCreateCameraCapture() will attempt to open the first (i.e., number-zero) Firewire camera. In most cases, the domain is unnecessary when we have only one camera; it is sufficient to use CV_CAP_ANY (which is conveniently equal to 0, so we don't even have to type that in). One last useful hint before we move on: you can pass -1 to cvCreateCameraCapture(), which will cause OpenCV to open a window that allows you to select the desired camera.
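To make the domain-plus-index arithmetic concrete, here is a minimal sketch; it assumes a second FireWire camera is actually attached to the system:

// open the second (index 1) FireWire camera: domain 300 plus camera index 1
CvCapture* capture = cvCreateCameraCapture( CV_CAP_FIREWIRE + 1 );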
int       cvGrabFrame( CvCapture* capture );
IplImage* cvRetrieveFrame( CvCapture* capture );
IplImage* cvQueryFrame( CvCapture* capture );
Once you have a valid CvCapture object, you can start grabbing frames. There are two ways to do this. One way is to call cvGrabFrame(), which takes the CvCapture* pointer and returns an integer. This integer will be 1 if the grab was successful and 0 if the grab failed. The cvGrabFrame() function copies the captured image to an internal buffer that is invisible to the user. Why would you want OpenCV to put the frame somewhere you can't access it? The answer is that this grabbed frame is unprocessed, and cvGrabFrame() is designed simply to get it onto the computer as quickly as possible.
Once you have called cvGrabFrame(), you can then call cvRetrieveFrame(). This function will do any necessary processing on the frame (such as the decompression stage in the codec) and then return an IplImage* pointer that points to another internal buffer (so do not rely on this image, because it will be overwritten the next time you call cvGrabFrame()). If you want to do anything in particular with this image, copy it elsewhere first. Because this pointer points to a structure maintained by OpenCV itself, you are not required to release the image and can expect trouble if you do so.
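Putting the two calls together, a single grab-then-retrieve step might look like the following sketch; the copy via cvCloneImage() is only needed if you want to keep the frame past the next grab:

if( cvGrabFrame( capture ) ) {
    IplImage* frame   = cvRetrieveFrame( capture );  // points to an internal buffer
    IplImage* my_copy = cvCloneImage( frame );       // private copy we are free to keep
    // ... work with my_copy here ...
    cvReleaseImage( &my_copy );                      // release the copy, never 'frame'
}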
Having said all that, there is a somewhat simpler method called cvQueryFrame(). This is, in effect, a combination of cvGrabFrame() and cvRetrieveFrame(); it also returns the same IplImage* pointer as cvRetrieveFrame() did.
It should be noted that, with a video file, the frame is automatically advanced whenever a cvGrabFrame() call is made. Hence a subsequent call will retrieve the next frame automatically.
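In practice, reading through an entire file often reduces to a loop like the following sketch; cvQueryFrame() returns NULL when no more frames can be read:

IplImage* frame;
while( (frame = cvQueryFrame( capture )) != NULL ) {
    // ... process or display 'frame' here; the buffer is owned by OpenCV ...
}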
Once you are done with the CvCapture device, you can release it with a call to cvReleaseCapture(). As with most other de-allocators in OpenCV, this routine takes a pointer to the CvCapture* pointer:
void cvReleaseCapture( CvCapture** capture );
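So, for a capture pointer like the ones created above, the call is simply:

cvReleaseCapture( &capture );  // note that we pass the address of the pointer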
There are many other things we can do with the CvCapture structure. In particular, we can check and set various properties of the video source:
double cvGetCaptureProperty( CvCapture* capture, int property_id );
int    cvSetCaptureProperty( CvCapture* capture, int property_id, double value );
The routine cvGetCaptureProperty() accepts any of the property IDs shown in Table 4-4.
Table 4-4. Video capture properties used by cvGetCaptureProperty() and cvSetCaptureProperty()
| Video capture property | Numerical value |
|---|---|
| CV_CAP_PROP_POS_MSEC | 0 |
| CV_CAP_PROP_POS_FRAMES | 1 |
| CV_CAP_PROP_POS_AVI_RATIO | 2 |
| CV_CAP_PROP_FRAME_WIDTH | 3 |
| CV_CAP_PROP_FRAME_HEIGHT | 4 |
| CV_CAP_PROP_FPS | 5 |
| CV_CAP_PROP_FOURCC | 6 |
| CV_CAP_PROP_FRAME_COUNT | 7 |
Most of these properties are self-explanatory. POS_MSEC is the current position in a video file, measured in milliseconds. POS_FRAMES is the current position expressed as a frame number. POS_AVI_RATIO is the position given as a number between 0 and 1 (this is actually quite useful when you want to position a trackbar to allow folks to navigate around your video). FRAME_WIDTH and FRAME_HEIGHT are the dimensions of the individual frames of the video to be read (or to be captured at the camera's current settings). FPS is specific to video files and indicates the number of frames per second at which the video was captured; you will need to know this if you want to play back your video and have it come out at the right speed. FOURCC is the four-character code for the compression codec to be used for the video you are currently reading. FRAME_COUNT should be the total number of frames in the video, but this figure is not entirely reliable.

All of these values are returned as type double, which is perfectly reasonable except for the case of FOURCC (FourCC) [FourCC85]. Here you will have to recast the result in order to interpret it, as described in Example 4-3.
Example 4-3. Unpacking a four-character code to identify a video codec
double f = cvGetCaptureProperty( capture, CV_CAP_PROP_FOURCC );
char* fourcc = (char*) (&f);
For each of these video capture properties, there is a corresponding cvSetCaptureProperty() function that will attempt to set the property. These are not all entirely meaningful; for example, you should not be setting the FOURCC of a video you are currently reading. Attempting to move around the video by setting one of the position properties will work, but only for some video codecs (we'll have more to say about video codecs in the next section).
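As a concrete illustration, the following sketch jumps to the middle of an open file by setting the AVI-ratio position and then reads back the resulting frame position (remember that this seeking only works with some codecs):

// seek to the halfway point of the video (codec permitting)
cvSetCaptureProperty( capture, CV_CAP_PROP_POS_AVI_RATIO, 0.5 );
double frame_pos = cvGetCaptureProperty( capture, CV_CAP_PROP_POS_FRAMES );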
The other thing we might want to do with video is to write it out to disk. OpenCV makes this easy; it is essentially the same as reading video but with a few extra details.

First we must create a CvVideoWriter device, which is the video writing analogue of CvCapture. This device will incorporate the following functions.
CvVideoWriter* cvCreateVideoWriter(
    const char* filename,
    int         fourcc,
    double      fps,
    CvSize      frame_size,
    int         is_color = 1
);
int  cvWriteFrame( CvVideoWriter* writer, const IplImage* image );
void cvReleaseVideoWriter( CvVideoWriter** writer );
You will notice that the video writer requires a few extra arguments. In addition to the filename, we have to tell the writer what codec to use, what the frame rate is, and how big the frames will be. Optionally we can tell OpenCV if the frames are black and white or color (the default is color).
Here, the codec is indicated by its four-character code. (For those of you who are not experts in compression codecs, they all have a unique four-character identifier associated with them.) In this case the int that is named fourcc in the argument list for cvCreateVideoWriter() is actually the four characters of the fourcc packed together. Since this comes up relatively often, OpenCV provides a convenient macro CV_FOURCC(c0,c1,c2,c3) that will do the bit packing for you.
Once you have a video writer, all you have to do is call cvWriteFrame() and pass in the CvVideoWriter* pointer and the IplImage* pointer for the image you want to write out.
Once you are finished, you must call cvReleaseVideoWriter() in order to close the writer and the file you were writing to. Even if you are normally a bit sloppy about de-allocating things at the end of a program, do not be sloppy about this. Unless you explicitly release the video writer, the video file to which you are writing may be corrupted.
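Putting the pieces together, a minimal sketch of copying frames from an open capture into a new file might look like this; the output filename, codec, frame rate, and frame size here are all assumptions you would adapt to your own source:

CvVideoWriter* writer = cvCreateVideoWriter(
    "output.avi",                  // hypothetical output filename
    CV_FOURCC('M','J','P','G'),    // motion-JPEG, assuming that codec is available
    15,                            // frames per second
    cvSize( 640, 480 )             // must match the size of the frames you write
);
IplImage* frame;
while( (frame = cvQueryFrame( capture )) != NULL ) {
    cvWriteFrame( writer, frame );
}
cvReleaseVideoWriter( &writer );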
[45] Of course, to be completely fair, we should probably confess that the headache caused by different codecs has been replaced by the analogous headache of determining which cameras are (or are not) supported on our system.