 Filter design
 =============

 This document explains guidelines that should be observed (or ignored with
 good reason) when writing filters for libavfilter.

 In this document, the word “frame” indicates either a video frame or a group
 of audio samples, as stored in an AVFrame structure.


 Format negotiation
 ==================

   The query_formats method should set, for each input and each output link,
   the list of supported formats.

   For video links, that means pixel format. For audio links, that means
   channel layout, sample format (the sample packing is implied by the sample
   format) and sample rate.

   The lists are not just lists, they are references to shared objects. When
   the negotiation mechanism computes the intersection of the formats
   supported at each end of a link, all references to both lists are replaced
   with a reference to the intersection. And when a single format is
   eventually chosen for a link amongst the remaining list, again, all
   references to the list are updated.

   That means that if a filter requires that its input and output have the
   same format amongst a supported list, all it has to do is use a reference
   to the same list of formats.
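
   The sharing mechanism can be pictured with a toy model. The sketch below
   is not the libavfilter implementation: ToyFormats, toy_formats_ref and
   toy_formats_merge are hypothetical names invented for illustration, and
   formats are plain integers. It only shows why two links that reference the
   same list still agree after the list has been intersected with another.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_MAX_FORMATS 8
#define TOY_MAX_REFS    8

/* Hypothetical toy type: a format list that records the address of every
 * pointer currently referring to it. */
typedef struct ToyFormats {
    int nb_formats;
    int formats[TOY_MAX_FORMATS];
    int nb_refs;
    struct ToyFormats **refs[TOY_MAX_REFS];
} ToyFormats;

/* Make *slot refer to list and remember where that reference lives. */
static void toy_formats_ref(ToyFormats *list, ToyFormats **slot)
{
    assert(list->nb_refs < TOY_MAX_REFS);
    *slot = list;
    list->refs[list->nb_refs++] = slot;
}

/* Reduce a to its intersection with b, then redirect every reference to b
 * so that it points to a. Returns the number of formats left in a. */
static int toy_formats_merge(ToyFormats *a, ToyFormats *b)
{
    int kept = 0;

    for (int i = 0; i < a->nb_formats; i++) {
        for (int j = 0; j < b->nb_formats; j++) {
            if (a->formats[i] == b->formats[j]) {
                a->formats[kept++] = a->formats[i];
                break;
            }
        }
    }
    a->nb_formats = kept;

    for (int i = 0; i < b->nb_refs; i++) {
        assert(a->nb_refs < TOY_MAX_REFS);
        *b->refs[i] = a;                    /* update the referring pointer */
        a->refs[a->nb_refs++] = b->refs[i]; /* and adopt the reference */
    }
    b->nb_refs = 0;
    return kept;
}
```

   In this model, a filter that stores the same list in its input slot and
   its output slot sees both slots redirected together when the list is
   merged with the one offered by a neighbouring filter.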

   query_formats can leave some formats unset and return AVERROR(EAGAIN) to
   cause the negotiation mechanism to try again later. That can be used by
   filters with complex requirements to use the format negotiated on one link
   to set the formats supported on another.


 Frame references ownership and permissions
 ==========================================

   Principle
   ---------

     Audio and video data are voluminous; the frame and frame reference
     mechanism is intended to avoid, as much as possible, expensive copies of
     that data while still allowing the filters to produce correct results.

     The data is stored in buffers represented by AVFrame structures.
     Several references can point to the same frame buffer; the buffer is
     automatically deallocated once all corresponding references have been
     destroyed.

     The characteristics of the data (resolution, sample rate, etc.) are
     stored in the reference; different references for the same buffer can
     show different characteristics. In particular, a video reference can
     point to only a part of a video buffer.

     A reference is usually obtained as input to the filter_frame method or
     requested using the ff_get_video_buffer or ff_get_audio_buffer
     functions. A new reference on an existing buffer can be created with
     av_frame_ref(). A reference is destroyed using
     the av_frame_free() function.

   Reference ownership
   -------------------

     At any time, a reference “belongs” to a particular piece of code,
     usually a filter. With a few caveats that will be explained below, only
     that piece of code is allowed to access it. It is also responsible for
     destroying it, although this is sometimes done automatically (see the
     section on link reference fields).

     Here are the (fairly obvious) rules for reference ownership:

     * A reference received by the filter_frame method belongs to the
       corresponding filter.

     * A reference passed to ff_filter_frame is given away and must no longer
       be used.

     * A reference created with av_frame_ref() belongs to the code that
       created it.

     * A reference obtained with ff_get_video_buffer or ff_get_audio_buffer
       belongs to the code that requested it.

     * A reference given as return value by the get_video_buffer or
       get_audio_buffer method is given away and must no longer be used.

   Link reference fields
   ---------------------

     The AVFilterLink structure has a few AVFrame fields.

     partial_buf is used by libavfilter internally and must not be accessed
     by filters.

     fifo contains frames queued in the filter's input. They belong to the
     framework until they are taken by the filter.

   Reference permissions
   ---------------------

     Since the same frame data can be shared by several frames, modifying it
     may have unintended consequences. A frame is considered writable if only
     one reference to it exists. The code owning that reference is then
     allowed to modify the data.

     A filter can check if a frame is writable by using the
     av_frame_is_writable() function.

     A filter can ensure that a frame is writable at some point of the code
     by using the ff_inlink_make_frame_writable() function. It will duplicate
     the frame if needed.

     A filter can ensure that the frame passed to the filter_frame() callback
     is writable by setting the needs_writable flag on the corresponding
     input pad. It does not apply to the activate() callback.
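
     The relation between reference counting and writability can be sketched
     with a toy model. Everything below (ToyBuffer, ToyFrame and the toy_*
     functions) is hypothetical and written only for illustration; it is not
     how AVFrame is implemented. It shows a buffer that is writable only
     while a single reference exists, and a copy made on demand otherwise.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical toy types: a reference-counted buffer and a frame that
 * holds one reference to it. */
typedef struct ToyBuffer {
    int refcount;
    size_t size;
    unsigned char *data;
} ToyBuffer;

typedef struct ToyFrame {
    ToyBuffer *buf;
} ToyFrame;

static ToyBuffer *toy_buffer_alloc(size_t size)
{
    ToyBuffer *buf = malloc(sizeof(*buf));
    if (!buf)
        return NULL;
    buf->data = calloc(1, size ? size : 1);
    if (!buf->data) {
        free(buf);
        return NULL;
    }
    buf->refcount = 1;
    buf->size = size;
    return buf;
}

/* Allocate a frame holding the only reference to a fresh, zeroed buffer. */
static ToyFrame *toy_frame_alloc(size_t size)
{
    ToyFrame *f = malloc(sizeof(*f));
    if (!f)
        return NULL;
    f->buf = toy_buffer_alloc(size);
    if (!f->buf) {
        free(f);
        return NULL;
    }
    return f;
}

/* Create a new reference to the buffer of src; no data is copied. */
static ToyFrame *toy_frame_ref(const ToyFrame *src)
{
    assert(src && src->buf);
    ToyFrame *f = malloc(sizeof(*f));
    if (!f)
        return NULL;
    f->buf = src->buf;
    f->buf->refcount++;
    return f;
}

/* Destroy a reference; the buffer goes away with its last reference. */
static void toy_frame_free(ToyFrame **frame)
{
    if (!frame || !*frame)
        return;
    ToyBuffer *buf = (*frame)->buf;
    if (--buf->refcount == 0) {
        free(buf->data);
        free(buf);
    }
    free(*frame);
    *frame = NULL;
}

/* Writable means: this is the only reference to the buffer. */
static int toy_frame_is_writable(const ToyFrame *f)
{
    return f->buf->refcount == 1;
}

/* Give f a private copy of the data if the buffer is shared.
 * Returns 0 on success, -1 if the copy could not be allocated. */
static int toy_frame_make_writable(ToyFrame *f)
{
    if (toy_frame_is_writable(f))
        return 0;
    ToyBuffer *copy = toy_buffer_alloc(f->buf->size);
    if (!copy)
        return -1;
    memcpy(copy->data, f->buf->data, f->buf->size);
    f->buf->refcount--;
    f->buf = copy;
    return 0;
}
```

     After toy_frame_make_writable() succeeds on a shared frame, the two
     frames point to different buffers, so a write through one of them is
     not visible through the other.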


 Frame scheduling
 ================

   The purpose of these rules is to ensure that frames flow in the filter
   graph without getting stuck and accumulating somewhere.

   Simple filters that output one frame for each input frame should not have
   to worry about it.

   There are two designs for filters: one using the filter_frame() and
   request_frame() callbacks and the other using the activate() callback.

   The design using filter_frame() and request_frame() is legacy, but it is
   suitable for filters that have a single input and process one frame at a
   time. New filters with several inputs, that treat several frames at a time
   or that require a special treatment at EOF should probably use the design
   using activate().

   activate
   --------

     This method is called when something must be done in a filter; the
     definition of that "something" depends on the semantics of the filter.

     The callback must examine the status of the filter's links and proceed
     accordingly.

     The status of output links is stored in the frame_wanted_out, status_in
     and status_out fields and tested by the ff_outlink_frame_wanted()
     function. If this function returns true, then the processing requires a
     frame on this link and the filter is expected to make efforts in that
     direction.

     The status of input links is stored in the status_in, fifo and
     status_out fields; they must not be accessed directly. The fifo field
     contains the frames that are queued in the input for processing by the
     filter. The status_in and status_out fields contain the queued status
     (EOF or error) of the link; status_in is a status change that must be
     taken into account after all frames in fifo have been processed;
     status_out is the status that has been taken into account; it is final
     when it is not 0.
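
     The ordering between queued frames and a pending status change can be
     pictured with a toy model. ToyLink and toy_inlink_acknowledge_status
     below are hypothetical names, the FIFO is reduced to a frame counter,
     and the return convention is an assumption of this sketch rather than a
     description of the real function. It only illustrates that status_in
     stays hidden until the FIFO is empty and that status_out, once set,
     does not change.

```c
#include <assert.h>

/* Hypothetical toy link: only the three fields discussed above, with the
 * FIFO reduced to a count of queued frames. */
typedef struct ToyLink {
    int queued_frames; /* stands in for fifo */
    int status_in;     /* status change waiting behind the queued frames */
    int status_out;    /* status already taken into account; final if != 0 */
} ToyLink;

/* Returns 1 and stores the status the first time a status change becomes
 * visible, that is, once no frame is queued any more. If a status was
 * already acknowledged, stores and returns it. Returns 0 otherwise. */
static int toy_inlink_acknowledge_status(ToyLink *link, int *status)
{
    assert(link && status);
    if (link->queued_frames)
        return *status = 0;            /* frames must be processed first */
    if (link->status_out)
        return *status = link->status_out;
    if (!link->status_in)
        return *status = 0;            /* nothing has changed */
    *status = link->status_out = link->status_in;
    return 1;
}
```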

     The typical task of an activate callback is to first check the backward
     status of output links, and if relevant forward it to the corresponding
     input. Then, if relevant, for each input link: test the availability of
     frames in fifo and process them; if no frame is available, test and
     acknowledge a change of status using ff_inlink_acknowledge_status(); and
     forward the result (frame or status change) to the corresponding output.
     If nothing is possible, test the status of outputs and forward it to the
     corresponding input(s). If still not possible, return FFERROR_NOT_READY.

     If the filter internally stores one or a few frames for some input, it
     can consider them to be part of the FIFO and delay acknowledging a
     status change accordingly.

     Example code:

     ret = ff_outlink_get_status(outlink);
     if (ret) {
         ff_inlink_set_status(inlink, ret);
         return 0;
     }
     if (priv->next_frame) {
         /* use it */
         return 0;
     }
     ret = ff_inlink_consume_frame(inlink, &frame);
     if (ret < 0)
         return ret;
     if (ret) {
         /* use it */
         return 0;
     }
     ret = ff_inlink_acknowledge_status(inlink, &status, &pts);
     if (ret) {
         /* flush */
         ff_outlink_set_status(outlink, status, pts);
         return 0;
     }
     if (ff_outlink_frame_wanted(outlink)) {
         ff_inlink_request_frame(inlink);
         return 0;
     }
     return FFERROR_NOT_READY;

     The exact code depends on how similar the /* use it */ blocks are and
     how related they are to the /* flush */ block, and needs to apply these
     operations to the correct inlink or outlink if there are several.

     Macros are available to factor that when no extra processing is needed:

     FF_FILTER_FORWARD_STATUS_BACK(outlink, inlink);
     FF_FILTER_FORWARD_STATUS_BACK_ALL(outlink, filter);
     FF_FILTER_FORWARD_STATUS(inlink, outlink);
     FF_FILTER_FORWARD_STATUS_ALL(inlink, filter);
     FF_FILTER_FORWARD_WANTED(outlink, inlink);

   filter_frame
   ------------

     For filters that do not use the activate() callback, this method is
     called when a frame is pushed to the filter's input. It can be called at
     any time except in a reentrant way.

     If the input frame is enough to produce output, then the filter should
     push the output frames on the output link immediately.

     As an exception to the previous rule, if the input frame is enough to
     produce several output frames, then the filter needs to output only at
     least one per link. The additional frames can be left buffered in the
     filter; these buffered frames must be flushed immediately if a new input
     produces new output.

     (Example: frame rate-doubling filter: filter_frame must (1) flush the
     second copy of the previous frame, if it is still there, (2) push the
     first copy of the incoming frame, (3) keep the second copy for later.)
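
     The three steps of this example can be sketched as follows. This is a
     toy model, not libavfilter code: frames are plain integers, "pushing"
     means writing to an output array, and ToyDoubler and the toy_doubler_*
     functions are hypothetical names.

```c
#include <assert.h>

/* Hypothetical toy state: at most one "second copy" is buffered. */
typedef struct ToyDoubler {
    int has_pending;
    int pending;
} ToyDoubler;

/* Called for each input frame. Writes the frames pushed downstream to out
 * (room for two) and returns how many were written. */
static int toy_doubler_filter_frame(ToyDoubler *s, int frame, int out[2])
{
    int n = 0;

    assert(s && out);
    if (s->has_pending)     /* (1) flush the second copy of the previous frame */
        out[n++] = s->pending;
    out[n++] = frame;       /* (2) push the first copy of the incoming frame */
    s->pending = frame;     /* (3) keep the second copy for later */
    s->has_pending = 1;
    return n;
}

/* Called when a frame is wanted on the output: push the buffered copy if
 * there is one. Returns 1 if a frame was written to *out, 0 otherwise. */
static int toy_doubler_request_frame(ToyDoubler *s, int *out)
{
    assert(s && out);
    if (!s->has_pending)
        return 0;
    *out = s->pending;
    s->has_pending = 0;
    return 1;
}
```

     With this model, the first input frame produces one output immediately
     and leaves one copy buffered; each later input flushes that copy before
     pushing its own first copy.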

     If the input frame is not enough to produce output, the filter must not
     call request_frame to get more. It must just process the frame or queue
     it. The task of requesting more frames is left to the filter's
     request_frame method or the application.

     If a filter has several inputs, the filter must be ready for frames
     arriving randomly on any input. Therefore, any filter with several inputs
     will most likely require some kind of queuing mechanism. It is perfectly
     acceptable to have a limited queue and to drop frames when the inputs
     are too unbalanced.
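
     A limited queue of that kind could look like the sketch below. It is a
     toy model with hypothetical names (ToyQueue, toy_queue_push,
     toy_queue_pop) and integer frames. Dropping the oldest frame when the
     queue is full is an assumption of this sketch; the text above does not
     say which frame should be dropped.

```c
#include <assert.h>

#define TOY_QUEUE_CAP 4

/* Hypothetical toy queue: a fixed-size ring buffer of integer frames. */
typedef struct ToyQueue {
    int frames[TOY_QUEUE_CAP];
    int head;    /* index of the oldest queued frame */
    int len;     /* number of queued frames */
    int dropped; /* number of frames discarded so far */
} ToyQueue;

/* Queue a frame; if the queue is full, discard the oldest one first. */
static void toy_queue_push(ToyQueue *q, int frame)
{
    assert(q);
    if (q->len == TOY_QUEUE_CAP) {
        q->head = (q->head + 1) % TOY_QUEUE_CAP;
        q->len--;
        q->dropped++;
    }
    q->frames[(q->head + q->len) % TOY_QUEUE_CAP] = frame;
    q->len++;
}

/* Take the oldest frame. Returns 1 on success, 0 if the queue is empty. */
static int toy_queue_pop(ToyQueue *q, int *frame)
{
    assert(q && frame);
    if (!q->len)
        return 0;
    *frame = q->frames[q->head];
    q->head = (q->head + 1) % TOY_QUEUE_CAP;
    q->len--;
    return 1;
}
```

     A filter with several inputs would keep one such queue per input and
     take frames from them as soon as every input needed for one output has
     something queued.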

   request_frame
   -------------

     For filters that do not use the activate() callback, this method is
     called when a frame is wanted on an output.

     For a source, it should directly call filter_frame on the corresponding
     output.

     For a filter, if there are queued frames already ready, one of these
     frames should be pushed. If not, the filter should request a frame on
     one of its inputs, repeatedly until at least one frame has been pushed.

     Return values:
     if request_frame could produce a frame, or at least make progress
     towards producing a frame, it should return 0;
     if it could not for temporary reasons, it should return AVERROR(EAGAIN);
     if it could not because there are no more frames, it should return
     AVERROR_EOF.

     The typical implementation of request_frame for a filter with several
     inputs will look like this:

         if (frames_queued) {
             push_one_frame();
             return 0;
         }
         input = input_where_a_frame_is_most_needed();
         ret = ff_request_frame(input);
         if (ret == AVERROR_EOF) {
             process_eof_on_input();
         } else if (ret < 0) {
             return ret;
         }
         return 0;

     Note that, except for filters that can have queued frames and sources,
     request_frame does not push frames: it requests them from its input, and
     as a reaction, the filter_frame method will possibly be called and do
     the work.