AVFoundation Programming Guide (Translation of the Official Documentation), Complete Edition

Category: iOS · Posted by sharezer · Source: yofer (张耀琦)

Complete edition – AVFoundation Programming Guide

– Chapter 1: About AVFoundation
– Chapter 2: Using Assets
– Chapter 3: Playback
– Chapter 4: Editing
– Chapter 5: Still and Video Media Capture
– Chapter 6: Export
– Chapter 7: Time and Media Representations

About AVFoundation

AVFoundation is one of several frameworks that you can use to play and create time-based audiovisual media. It provides an Objective-C interface you use to work on a detailed level with time-based audiovisual data. For example, you can use it to examine, create, edit, or reencode media files. You can also get input streams from devices and manipulate video during realtime capture and playback. Figure I-1 shows the architecture on iOS.

Figure I-2 shows the corresponding media architecture on OS X.

You should typically use the highest-level abstraction available that allows you to perform the tasks you want.

  • If you simply want to play movies, use the AVKit framework.
  • On iOS, to record video when you need only minimal control over format, use the UIKit framework (UIImagePickerController).

Note, however, that some of the primitive data structures that you use in AV Foundation—including time-related data structures and opaque objects to carry and describe media data—are declared in the Core Media framework.


At a Glance

There are two facets to the AVFoundation framework—APIs related to video and APIs related just to audio. The older audio-related classes provide easy ways to deal with audio. They are described in the Multimedia Programming Guide, not in this document.

You can also configure the audio behavior of your application using AVAudioSession; this is described in Audio Session Programming Guide.

Representing and Using Media with AVFoundation

The primary class that the AV Foundation framework uses to represent media is AVAsset. The design of the framework is largely guided by this representation. Understanding its structure will help you to understand how the framework works. An AVAsset instance is an aggregated representation of a collection of one or more pieces of media data (audio and video tracks). It provides information about the collection as a whole, such as its title, duration, natural presentation size, and so on. AVAsset is not tied to a particular data format. AVAsset is the superclass of other classes used to create asset instances from media at a URL (see Using Assets) and to create new compositions (see Editing).

Each of the individual pieces of media data in the asset is of a uniform type and called a track. In a typical simple case, one track represents the audio component, and another represents the video component; in a complex composition, however, there may be multiple overlapping tracks of audio and video. Assets may also have metadata.

A vital concept in AV Foundation is that initializing an asset or a track does not necessarily mean that it is ready for use. It may require some time to calculate even the duration of an item (an MP3 file, for example, may not contain summary information). Rather than blocking the current thread while a value is being calculated, you ask for values and get an answer back asynchronously through a callback that you define using a block.

Relevant Chapters: Using Assets, Time and Media Representations

Playback

AVFoundation allows you to manage the playback of assets in sophisticated ways. To support this, it separates the presentation state of an asset from the asset itself. This allows you to, for example, play two different segments of the same asset at the same time rendered at different resolutions. The presentation state for an asset is managed by a player item object; the presentation state for each track within an asset is managed by a player item track object. Using the player item and player item tracks you can, for example, set the size at which the visual portion of the item is presented by the player, set the audio mix parameters and video composition settings to be applied during playback, or disable components of the asset during playback.

You play player items using a player object, and direct the output of a player to the Core Animation layer. You can use a player queue to schedule playback of a collection of player items in sequence.

Relevant Chapter: Playback


Reading, Writing, and Reencoding Assets

AVFoundation allows you to create new representations of an asset in several ways. You can simply reencode an existing asset, or—in iOS 4.1 and later—you can perform operations on the contents of an asset and save the result as a new asset.

You use an export session to reencode an existing asset into a format defined by one of a small number of commonly-used presets. If you need more control over the transformation, in iOS 4.1 and later you can use an asset reader and asset writer object in tandem to convert an asset from one representation to another. Using these objects you can, for example, choose which of the tracks you want to be represented in the output file, specify your own output format, or modify the asset during the conversion process.

To produce a visual representation of the waveform, you use an asset reader to read the audio track of an asset.

Relevant Chapter: Using Assets

Thumbnails

To create thumbnail images of video presentations, you initialize an instance of AVAssetImageGenerator using the asset from which you want to generate thumbnails. AVAssetImageGenerator uses the default enabled video tracks to generate images.

Relevant Chapter: Using Assets

Editing

AVFoundation uses compositions to create new assets from existing pieces of media (typically, one or more video and audio tracks). You use a mutable composition to add and remove tracks, and adjust their temporal orderings. You can also set the relative volumes and ramping of audio tracks; and set the opacity, and opacity ramps, of video tracks. A composition is an assemblage of pieces of media held in memory. When you export a composition using an export session, it’s collapsed to a file.

You can also create an asset from media such as sample buffers or still images using an asset writer.

Relevant Chapter: Editing


Still and Video Media Capture

Recording input from cameras and microphones is managed by a capture session. A capture session coordinates the flow of data from input devices to outputs such as a movie file. You can configure multiple inputs and outputs for a single session, even when the session is running. You send messages to the session to start and stop data flow.

In addition, you can use an instance of a preview layer to show the user what a camera is recording.
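
As a rough illustration of the capture session described above, the following sketch connects the default camera to a movie file output. The function name MakeMovieCaptureSession is hypothetical, error handling is minimal, and a real application also needs the user's permission to use the camera, which is not shown here.

```objc
#import <AVFoundation/AVFoundation.h>

// Hypothetical helper: builds a session with one camera input and one
// movie file output. Returns nil if any step fails.
static AVCaptureSession *MakeMovieCaptureSession(NSError **outError) {
    AVCaptureSession *session = [[AVCaptureSession alloc] init];

    // The default video device may be nil (for example, no camera present);
    // in that case outError is left untouched.
    AVCaptureDevice *camera =
        [AVCaptureDevice defaultDeviceWithMediaType:AVMediaTypeVideo];
    if (camera == nil) {
        return nil;
    }

    AVCaptureDeviceInput *input =
        [AVCaptureDeviceInput deviceInputWithDevice:camera error:outError];
    if (input == nil || ![session canAddInput:input]) {
        return nil;
    }
    [session addInput:input];

    AVCaptureMovieFileOutput *output = [[AVCaptureMovieFileOutput alloc] init];
    if (![session canAddOutput:output]) {
        return nil;
    }
    [session addOutput:output];

    return session;
}
```

With a session in hand, sending it startRunning and stopRunning starts and stops the data flow, and [AVCaptureVideoPreviewLayer layerWithSession:session] gives a layer that shows what the camera sees.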

Relevant Chapter: Still and Video Media Capture

Concurrent Programming with AVFoundation

Callbacks from AVFoundation—invocations of blocks, key-value observers, and notification handlers—are not guaranteed to be made on any particular thread or queue. Instead, AVFoundation invokes these handlers on threads or queues on which it performs its internal tasks.

There are two general guidelines regarding notifications and threading:

  • UI-related notifications occur on the main thread.
  • Classes or methods that require you to create and/or specify a queue will return notifications on that queue.

Beyond those two guidelines (and there are exceptions, which are noted in the reference documentation) you should not assume that a notification will be returned on any specific thread.



If you’re writing a multithreaded application, you can use the NSThread method isMainThread or [[NSThread currentThread] isEqual:<#A stored thread reference#>] to test whether the invocation thread is a thread you expect to perform your work on. You can redirect messages to appropriate threads using methods such as performSelectorOnMainThread:withObject:waitUntilDone: and performSelector:onThread:withObject:waitUntilDone:modes:. You could also use dispatch_async to “bounce” your blocks to an appropriate queue, either the main queue for UI tasks or a queue you have set up for concurrent operations. For more about concurrent operations, see Concurrency Programming Guide; for more about blocks, see Blocks Programming Topics. The AVCam-iOS: Using AVFoundation to Capture Images and Movies sample code is considered the primary example for all AVFoundation functionality and can be consulted for examples of thread and queue usage with AVFoundation.
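
One way to apply the dispatch_async advice is a small wrapper that runs UI work on the main queue regardless of where a callback arrives. This is only a sketch; RunOnMainQueue is a hypothetical helper, not an AVFoundation API.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: runs `work` right away when already on the main
// thread, otherwise schedules it asynchronously on the main queue.
static void RunOnMainQueue(dispatch_block_t work) {
    if ([NSThread isMainThread]) {
        work();
    } else {
        dispatch_async(dispatch_get_main_queue(), work);
    }
}
```

Inside an AVFoundation completion handler you might then write RunOnMainQueue(^{ /* update the user interface */ }); so that the interface is only touched from the main thread.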

Prerequisites

AVFoundation is an advanced Cocoa framework. To use it effectively, you must have:

  • A solid understanding of fundamental Cocoa development tools and techniques
  • A basic grasp of blocks
  • A basic understanding of key-value coding and key-value observing
  • For playback, a basic understanding of Core Animation (see Core Animation Programming Guide) or, for basic playback, the AVKit Framework Reference.

See Also

There are several AVFoundation examples, including two that are key to understanding and implementing camera capture functionality:

  • AVCam-iOS: Using AVFoundation to Capture Images and Movies is the canonical sample code for implementing any program that uses the camera functionality. It is a complete sample, well documented, and covers the majority of the functionality showing the best practices.
  • AVCamManual: Extending AVCam to Use Manual Capture API is the companion application to AVCam. It implements Camera functionality using the manual camera controls. It is also a complete example, well documented, and should be considered the canonical example for creating camera applications that take advantage of manual controls.
  • RosyWriter is an example that demonstrates real time frame processing and in particular how to apply filters to video content. This is a very common developer requirement and this example covers that functionality.
  • AVLocationPlayer: Using AVFoundation Metadata Reading APIs demonstrates using the metadata APIs.

Using Assets

Assets can come from a file or from media in the user’s iPod library or Photo library. When you create an asset object, not all the information that you might want to retrieve for that item is immediately available. Once you have a movie asset, you can extract still images from it, transcode it to another format, or trim the contents.

Creating an Asset Object

To create an asset to represent any resource that you can identify using a URL, you use AVURLAsset. The simplest case is creating an asset from a file:

NSURL *url = <#A URL that identifies an audiovisual asset such as a movie file#>;
AVURLAsset *anAsset = [[AVURLAsset alloc] initWithURL:url options:nil];

Options for Initializing an Asset

The AVURLAsset initialization methods take as their second argument an options dictionary. The only key used in the dictionary is AVURLAssetPreferPreciseDurationAndTimingKey. The corresponding value is a Boolean (contained in an NSValue object) that indicates whether the asset should be prepared to indicate a precise duration and provide precise random access by time.

Getting the exact duration of an asset may require significant processing overhead. Using an approximate duration is typically a cheaper operation and sufficient for playback. Thus:

  • If you only intend to play the asset, either pass nil instead of a dictionary, or pass a dictionary that contains the AVURLAssetPreferPreciseDurationAndTimingKey key and a corresponding value of NO (contained in an NSValue object).
  • If you want to add the asset to a composition (AVMutableComposition), you typically need precise random access. Pass a dictionary that contains the AVURLAssetPreferPreciseDurationAndTimingKey key and a corresponding value of YES (contained in an NSValue object; recall that NSNumber inherits from NSValue):

NSURL *url = <#A URL that identifies an audiovisual asset such as a movie file#>;
NSDictionary *options = @{ AVURLAssetPreferPreciseDurationAndTimingKey : @YES };
AVURLAsset *anAssetToUseInAComposition = [[AVURLAsset alloc] initWithURL:url options:options];

Accessing the User’s Assets

To access the assets managed by the iPod library or by the Photos application, you need to get a URL of the asset you want.

  • To access the iPod Library, you create an MPMediaQuery instance to find the item you want, then get its URL using MPMediaItemPropertyAssetURL. For more about the Media Library, see Multimedia Programming Guide.
  • To access the assets managed by the Photos application, you use ALAssetsLibrary.

The following example shows how you can get an asset to represent the first video in the Saved Photos Album.

ALAssetsLibrary *library = [[ALAssetsLibrary alloc] init];

// Enumerate just the photos and videos group by using ALAssetsGroupSavedPhotos.
[library enumerateGroupsWithTypes:ALAssetsGroupSavedPhotos usingBlock:^(ALAssetsGroup *group, BOOL *stop) {

    // Within the group enumeration block, filter to enumerate just videos.
    [group setAssetsFilter:[ALAssetsFilter allVideos]];

    // Skip the final nil group and groups with no videos; index 0 would be out of range.
    if ([group numberOfAssets] == 0) {
        return;
    }

    // For this example, we're only interested in the first item.
    [group enumerateAssetsAtIndexes:[NSIndexSet indexSetWithIndex:0]
                            options:0
                         usingBlock:^(ALAsset *alAsset, NSUInteger index, BOOL *innerStop) {

         // The end of the enumeration is signaled by alAsset == nil.
         if (alAsset) {
             ALAssetRepresentation *representation = [alAsset defaultRepresentation];
             NSURL *url = [representation url];
             AVAsset *avAsset = [AVURLAsset URLAssetWithURL:url options:nil];
             // Do something interesting with the AV asset.
         }
     }];
}
failureBlock:^(NSError *error) {
    // Typically you should handle an error more gracefully than this.
    NSLog(@"No groups");
}];

Preparing an Asset for Use

Initializing an asset (or track) does not necessarily mean that all the information that you might want to retrieve for that item is immediately available. It may require some time to calculate even the duration of an item (an MP3 file, for example, may not contain summary information). Rather than blocking the current thread while a value is being calculated, you should use the AVAsynchronousKeyValueLoading protocol to ask for values and get an answer back later through a completion handler you define using a block. (AVAsset and AVAssetTrack conform to the AVAsynchronousKeyValueLoading protocol.)

You test whether a value is loaded for a property using statusOfValueForKey:error:. When an asset is first loaded, the value of most or all of its properties is AVKeyValueStatusUnknown. To load a value for one or more properties, you invoke loadValuesAsynchronouslyForKeys:completionHandler:. In the completion handler, you take whatever action is appropriate depending on the property’s status. You should always be prepared for loading to not complete successfully, either because it failed for some reason such as a network-based URL being inaccessible, or because the load was canceled.

NSURL *url = <#A URL that identifies an audiovisual asset such as a movie file#>;
AVURLAsset *anAsset = [[AVURLAsset alloc] initWithURL:url options:nil];
NSArray *keys = @[@"duration"];

[anAsset loadValuesAsynchronouslyForKeys:keys completionHandler:^{

    NSError *error = nil;
    AVKeyValueStatus durationStatus = [anAsset statusOfValueForKey:@"duration" error:&error];
    switch (durationStatus) {
        case AVKeyValueStatusLoaded:
            [self updateUserInterfaceForDuration];
            break;
        case AVKeyValueStatusFailed:
            [self reportError:error forAsset:anAsset];
            break;
        case AVKeyValueStatusCancelled:
            // Do whatever is appropriate for cancelation.
            break;
        default:
            break;
    }
}];

If you want to prepare an asset for playback, you should load its tracks property. For more about playing assets, see Playback.
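
The same loading pattern applies to the tracks key. The sketch below loads tracks and, once the value is available, creates a player item for the asset. LoadTracksThenMakePlayerItem is a hypothetical helper name, and the completion block runs on whatever thread or queue AVFoundation uses for the callback, so UI work should be dispatched to the main queue by the caller.

```objc
#import <AVFoundation/AVFoundation.h>

// Hypothetical helper: loads the "tracks" key and passes either a player
// item or an error to `completion`. When loading is canceled, both the
// item and the error may be nil.
static void LoadTracksThenMakePlayerItem(AVAsset *asset,
        void (^completion)(AVPlayerItem *item, NSError *error)) {
    [asset loadValuesAsynchronouslyForKeys:@[@"tracks"] completionHandler:^{
        NSError *error = nil;
        AVKeyValueStatus status =
            [asset statusOfValueForKey:@"tracks" error:&error];

        AVPlayerItem *item = nil;
        if (status == AVKeyValueStatusLoaded) {
            item = [AVPlayerItem playerItemWithAsset:asset];
        }
        completion(item, error);
    }];
}
```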

Getting Still Images From a Video

To get still images such as thumbnails from an asset for playback, you use an AVAssetImageGenerator object. You initialize an image generator with your asset. Initialization may succeed, though, even if the asset possesses no visual tracks at the time of initialization, so if necessary you should test whether the asset has any tracks with the visual characteristic using tracksWithMediaCharacteristic:.

AVAsset *anAsset = <#Get an asset#>;
if ([[anAsset tracksWithMediaType:AVMediaTypeVideo] count] > 0) {
    AVAssetImageGenerator *imageGenerator =
        [AVAssetImageGenerator assetImageGeneratorWithAsset:anAsset];
    // Implementation continues...
}

You can configure several aspects of the image generator; for example, you can specify the maximum dimensions for the images it generates and the aperture mode using maximumSize and apertureMode respectively. You can then generate a single image at a given time, or a series of images. You must ensure that you keep a strong reference to the image generator until it has generated all the images.
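
A minimal sketch of that configuration follows. The helper name MakeThumbnailGenerator, the 320 by 180 size, and the choice of clean aperture are illustrative assumptions, not values taken from the guide.

```objc
#import <AVFoundation/AVFoundation.h>

// Hypothetical helper: returns an image generator limited to small
// thumbnails. The caller must keep a strong reference to the result
// until all images have been generated.
static AVAssetImageGenerator *MakeThumbnailGenerator(AVAsset *asset) {
    AVAssetImageGenerator *generator =
        [AVAssetImageGenerator assetImageGeneratorWithAsset:asset];

    // Assumed thumbnail bounds; images are scaled to fit inside this size.
    generator.maximumSize = CGSizeMake(320.0, 180.0);

    // Assumed aperture mode for illustration.
    generator.apertureMode = AVAssetImageGeneratorApertureModeCleanAperture;

    return generator;
}
```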

Generating a Single Image

You use copyCGImageAtTime:actualTime:error: to generate a single image at a specific time. AVFoundation may not be able to produce an image at exactly the time you request, so you can pass as the second argument a pointer to a CMTime that upon return contains the time at which the image was actually generated.

AVAsset *myAsset = <#An asset#>;
AVAssetImageGenerator *imageGenerator = [[AVAssetImageGenerator alloc] initWithAsset:myAsset];

Float64 durationSeconds = CMTimeGetSeconds([myAsset duration]);
CMTime midpoint = CMTimeMakeWithSeconds(durationSeconds/2.0, 600);
NSError *error = nil;
CMTime actualTime;

CGImageRef halfWayImage = [imageGenerator copyCGImageAtTime:midpoint actualTime:&actualTime error:&error];

if (halfWayImage != NULL) {

    // CMTimeCopyDescription returns an owned CFStringRef; CFBridgingRelease transfers it to ARC.
    NSString *actualTimeString = (NSString *)CFBridgingRelease(CMTimeCopyDescription(NULL, actualTime));
    NSString *requestedTimeString = (NSString *)CFBridgingRelease(CMTimeCopyDescription(NULL, midpoint));
    NSLog(@"Got halfWayImage: Asked for %@, got %@", requestedTimeString, actualTimeString);

    // Do something interesting with the image.
    CGImageRelease(halfWayImage);
}

Generating a Sequence of Images

To generate a series of images, you send the image generator a generateCGImagesAsynchronouslyForTimes:completionHandler: message. The first argument is an array of NSValue objects, each containing a CMTime structure, specifying the asset times for which you want images to be generated. The second argument is a block that serves as a callback invoked for each image that is generated. The block arguments provide a result constant that tells you whether the image was created successfully or if the operation was canceled, and, as appropriate:

  • The image
  • The time for which you requested the image and the actual time for which the image was generated
  • An error object that describes the reason generation failed

In your implementation of the block, check the result constant to determine whether the image was created. In addition, ensure that you keep a strong reference to the image generator until it has finished creating the images.

AVAsset *myAsset = <#An asset#>;
// Assume: @property (strong) AVAssetImageGenerator *imageGenerator;
self.imageGenerator = [AVAssetImageGenerator assetImageGeneratorWithAsset:myAsset];

Float64 durationSeconds = CMTimeGetSeconds([myAsset duration]);
CMTime firstThird = CMTimeMakeWithSeconds(durationSeconds/3.0, 600);
CMTime secondThird = CMTimeMakeWithSeconds(durationSeconds*2.0/3.0, 600);
CMTime end = CMTimeMakeWithSeconds(durationSeconds, 600);
NSArray *times = @[[NSValue valueWithCMTime:kCMTimeZero],
                   [NSValue valueWithCMTime:firstThird],
                   [NSValue valueWithCMTime:secondThird],
                   [NSValue valueWithCMTime:end]];

[self.imageGenerator generateCGImagesAsynchronouslyForTimes:times
 completionHandler:^(CMTime requestedTime, CGImageRef image, CMTime actualTime,
                     AVAssetImageGeneratorResult result, NSError *error) {

     NSString *requestedTimeString = (NSString *)
         CFBridgingRelease(CMTimeCopyDescription(NULL, requestedTime));
     NSString *actualTimeString = (NSString *)
         CFBridgingRelease(CMTimeCopyDescription(NULL, actualTime));
     NSLog(@"Requested: %@; actual %@", requestedTimeString, actualTimeString);

     if (result == AVAssetImageGeneratorSucceeded) {
         // Do something interesting with the image.
     }

     if (result == AVAssetImageGeneratorFailed) {
         NSLog(@"Failed with error: %@", [error localizedDescription]);
     }
     if (result == AVAssetImageGeneratorCancelled) {
         NSLog(@"Canceled");
     }
 }];

You can cancel the generation of the image sequence by sending the image generator a cancelAllCGImageGeneration message.

Trimming and Transcoding a Movie

You can transcode a movie from one format to another, and trim a movie, using an AVAssetExportSession object. The workflow is shown in Figure 1-1. An export session is a controller object that manages asynchronous export of an asset. You initialize the session using the asset you want to export and the name of an export preset that indicates the export options you want to apply (see allExportPresets). You then configure the export session to specify the output URL and file type, and optionally other settings such as the metadata and whether the output should be optimized for network use.

To check whether a given asset can be exported with a given preset, use exportPresetsCompatibleWithAsset:, as illustrated in this example:

AVAsset *anAsset = <#Get an asset#>;
NSArray *compatiblePresets = [AVAssetExportSession exportPresetsCompatibleWithAsset:anAsset];
if ([compatiblePresets containsObject:AVAssetExportPresetLowQuality]) {
    AVAssetExportSession *exportSession = [[AVAssetExportSession alloc]
                                           initWithAsset:anAsset presetName:AVAssetExportPresetLowQuality];
    // Implementation continues.
}

You complete the configuration of the session by providing the output URL (which must be a file URL). AVAssetExportSession can infer the output file type from the URL’s path extension; typically, however, you set it directly using outputFileType. You can also specify additional properties such as the time range, a limit for the output file length, whether the exported file should be optimized for network use, and a video composition. The following example illustrates how to use the timeRange property to trim the movie:

exportSession.outputURL = <#A file URL#>;
exportSession.outputFileType = AVFileTypeQuickTimeMovie;

CMTime start = CMTimeMakeWithSeconds(1.0, 600);
CMTime duration = CMTimeMakeWithSeconds(3.0, 600);
CMTimeRange range = CMTimeRangeMake(start, duration);
exportSession.timeRange = range;

To create the new file, you invoke exportAsynchronouslyWithCompletionHandler:. The completion handler block is called when the export operation finishes; in your implementation of the handler, you should check the session’s status value to determine whether the export was successful, failed, or was canceled:

[exportSession exportAsynchronouslyWithCompletionHandler:^{

    switch ([exportSession status]) {
        case AVAssetExportSessionStatusFailed:
            NSLog(@"Export failed: %@", [[exportSession error] localizedDescription]);
            break;
        case AVAssetExportSessionStatusCancelled:
            NSLog(@"Export canceled");
            break;
        default:
            break;
    }
}];

You can cancel the export by sending the session a cancelExport message.

The export will fail if you try to overwrite an existing file, or write a file outside of the application’s sandbox. It may also fail if:

  • There is an incoming phone call
  • Your application is in the background and another application starts playback

In these situations, you should typically inform the user that the export failed, then allow the user to restart the export.
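
Because an export fails when the output file already exists, one common approach is to remove any stale file before starting or restarting the export. This is a sketch under the assumption that the application owns the file at the output URL; RemoveExistingFileAtURL is a hypothetical helper.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: deletes the file at `fileURL` if one exists.
// Returns YES when no file remains, NO (with *outError set) on failure.
static BOOL RemoveExistingFileAtURL(NSURL *fileURL, NSError **outError) {
    NSFileManager *fileManager = [NSFileManager defaultManager];
    if (![fileManager fileExistsAtPath:fileURL.path]) {
        return YES; // Nothing to remove.
    }
    return [fileManager removeItemAtURL:fileURL error:outError];
}
```

Calling this with the session's outputURL before exportAsynchronouslyWithCompletionHandler: avoids the overwrite failure when the user retries an export.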

Playback

To control the playback of assets, you use an AVPlayer object. During playback, you can use an AVPlayerItem instance to manage the presentation state of an asset as a whole, and an AVPlayerItemTrack object to manage the presentation state of an individual track. To display video, you use an AVPlayerLayer object.

Playing Assets

A player is a controller object that you use to manage playback of an asset, for example starting and stopping playback, and seeking to a particular time. You use an instance of AVPlayer to play a single asset. You can use an AVQueuePlayer object to play a number of items in sequence (AVQueuePlayer is a subclass of AVPlayer). On OS X you have the option of using the AVKit framework’s AVPlayerView class to play the content back within a view.

A player provides you with information about the state of the playback so, if you need to, you can synchronize your user interface with the player’s state. You typically direct the output of a player to a specialized Core Animation layer (an instance of AVPlayerLayer or AVSynchronizedLayer). To learn more about layers, see Core Animation Programming Guide.

播放器提供了关于播放状态的信息,因此如果需要,可以将用户界面与播放器的状态同步。通常将播放器的输出指向专门的动画核心层(AVPlayerLayer 或者 AVSynchronizedLayer 的一个实例)。想要了解更多关于 layers,请看 Core Animation Programming Guide

Multiple player layers: You can create many AVPlayerLayer objects from a single AVPlayer instance, but only the most recently created such layer will display any video content onscreen.

多个播放器层:可以从一个单独的 AVPlayer 实例创建许多 AVPlayerLayer 对象,但是只有最近被创建的那一层将会屏幕上显示视频的内容。

Although ultimately you want to play an asset, you don’t provide assets directly to an AVPlayer object. Instead, you provide an instance of AVPlayerItem. A player item manages the presentation state of an asset with which it is associated. A player item contains player item tracks—instances of AVPlayerItemTrack—that correspond to the tracks in the asset. The relationship between the various objects is shown in Figure 2-1.

虽然最终想要播放的是资产,但并不是直接把资产提供给 AVPlayer 对象,而是提供一个 AVPlayerItem 的实例。player item 管理与它相关联的资产的显示状态。player item 包含播放器项目轨道(AVPlayerItemTrack 的实例),它们对应资产内的轨道。各个对象之间的关系如图 2-1 所示。

[Figure 2-1]

This abstraction means that you can play a given asset using different players simultaneously, but rendered in different ways by each player. Figure 2-2 shows one possibility, with two different players playing the same asset, with different settings. Using the item tracks, you can, for example, disable a particular track during playback (for example, you might not want to play the sound component).

这种抽象意味着可以同时使用不同的播放器播放一个给定的资产,但每个播放器都以不同的方式呈现。图 2-2 显示了一种可能性:两个不同的播放器以不同的设定播放同一个资产。借助项目轨道,可以在播放期间禁用某个特定的轨道(例如,你可能不想播放声音部分)。

[Figure 2-2]

You can initialize a player item with an existing asset, or you can initialize a player item directly from a URL so that you can play a resource at a particular location (AVPlayerItem will then create and configure an asset for the resource). As with AVAsset, though, simply initializing a player item doesn’t necessarily mean it’s ready for immediate playback. You can observe (using key-value observing) an item’s status property to determine if and when it’s ready to play.

可以用现有的资产初始化一个播放器项目,也可以直接从一个 URL 初始化播放器项目,以便播放位于特定位置的资源(AVPlayerItem 将为该资源创建和配置资产)。不过,和 AVAsset 一样,仅仅初始化一个播放器项目并不意味着它已经可以立即播放。可以观察(使用 key-value observing)项目的 status 属性,以确定它是否以及何时准备好播放。

Handling Different Types of Asset – 处理不同类型的资产

The way you configure an asset for playback may depend on the sort of asset you want to play. Broadly speaking, there are two main types: file-based assets, to which you have random access (such as from a local file, the camera roll, or the Media Library), and stream-based assets (HTTP Live Streaming format).

配置一个准备播放的资产的方法可能取决于你想播放的资产的类型。概括地说,主要有两种类型:基于文件的资产,可以随机访问(例如来自本地文件、相机胶卷或者媒体库),和基于流的资产(HTTP 直播流媒体格式)。

To load and play a file-based asset, follow these steps:

  • Create an asset using AVURLAsset.
  • Create an instance of AVPlayerItem using the asset.
  • Associate the item with an instance of AVPlayer.
  • Wait until the item’s status property indicates that it’s ready to play (typically you use key-value observing to receive a notification when the status changes).

This approach is illustrated in Putting It All Together: Playing a Video File Using AVPlayerLayer.

To create and prepare an HTTP live stream for playback, initialize an instance of AVPlayerItem using the URL. (You cannot directly create an AVAsset instance to represent the media in an HTTP Live Stream.)


  • 使用 AVURLAsset 创建一个资产
  • 使用资产创建一个 AVPlayerItem 的实例
  • AVPlayer 的实例与项目联结
  • 等待,直到项目的 status 属性表明已经准备好播放了(通常当状态改变时,使用 key-value observing 接受通知)

该方法的说明都在:Putting It All Together: Playing a Video File Using AVPlayerLayer

要创建并准备一个用于播放的 HTTP 直播流,使用 URL 初始化一个 AVPlayerItem 的实例。(你不能直接创建一个 AVAsset 的实例来代表 HTTP 直播流中的媒体。)

NSURL *url = [NSURL URLWithString:@"<#Live stream URL#>"];
// You may find a test stream at <http://devimages.apple.com/iphone/samples/bipbop/bipbopall.m3u8>.
self.playerItem = [AVPlayerItem playerItemWithURL:url];
// Start observing the item's status before associating it with the player.
[self.playerItem addObserver:self forKeyPath:@"status" options:0 context:&ItemStatusContext];
self.player = [AVPlayer playerWithPlayerItem:self.playerItem];

When you associate the player item with a player, it starts to become ready to play. When it is ready to play, the player item creates the AVAsset and AVAssetTrack instances, which you can use to inspect the contents of the live stream.

To get the duration of a streaming item, you can observe the duration property on the player item. When the item becomes ready to play, this property updates to the correct value for the stream.

当你把播放项目和播放器联结起来时,它开始准备播放。当它准备就绪时,播放项目会创建 AVAsset 和 AVAssetTrack 实例,可以用它们来检查直播流的内容。

获取一个流项目的持续时间,可以观察播放项目的 duration 属性。当项目准备就绪时,这个属性更新为流的正确值。

Note: Using the duration property on the player item requires iOS 4.3 or later. An approach that is compatible with all versions of iOS involves observing the status property of the player item. When the status becomes AVPlayerItemStatusReadyToPlay, the duration can be fetched with the following line of code:

注意:在播放项目上使用 duration 属性要求 iOS 4.3 或者更高的版本。一种兼容所有 iOS 版本的做法是观察播放项目的 status 属性。当 status 变成 AVPlayerItemStatusReadyToPlay 时,可以用下面这行代码获取持续时间:

[[[[[playerItem tracks] objectAtIndex:0] assetTrack] asset] duration];

If you simply want to play a live stream, you can take a shortcut and create a player directly using the URL, as in the following code:

如果你只是想播放一个直播流,你可以采取一种快捷方式,并使用 URL 直接创建一个播放器,代码如下:

self.player = [AVPlayer playerWithURL:<#Live stream URL#>];
[self.player addObserver:self forKeyPath:@"status" options:0 context:&PlayerStatusContext];

As with assets and items, initializing the player does not mean it’s ready for playback. You should observe the player’s status property, which changes to AVPlayerStatusReadyToPlay when it is ready to play. You can also observe the currentItem property to access the player item created for the stream.

与资产和项目一样,初始化播放器并不意味着它已经准备就绪可以播放。你应该观察播放器的 status 属性,准备就绪时它会变为 AVPlayerStatusReadyToPlay 。也可以观察 currentItem 属性来访问为该流创建的播放项目。

If you don’t know what kind of URL you have, follow these steps:

  • Try to initialize an AVURLAsset using the URL, then load its tracks key.
    If the tracks load successfully, then you create a player item for the asset.
  • If the previous step fails, create an AVPlayerItem directly from the URL.
    Observe the player’s status property to determine whether it becomes playable.

If either route succeeds, you end up with a player item that you can then associate with a player.
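
The two-step fallback above can be sketched roughly as follows. This is only an illustration, not a listing from the guide: the function name PreparePlayerItemForURL and its completion parameter are hypothetical. Note that the sketch does not verify that the fallback item ever becomes playable; you would still observe its status property for that.

```objc
#import <AVFoundation/AVFoundation.h>

// Hypothetical helper (not part of AVFoundation): tries the file-based route
// first and falls back to creating the item directly from the URL.
static void PreparePlayerItemForURL(NSURL *url, void (^completion)(AVPlayerItem *item)) {
    AVURLAsset *asset = [AVURLAsset URLAssetWithURL:url options:nil];
    NSString *tracksKey = @"tracks";

    [asset loadValuesAsynchronouslyForKeys:@[tracksKey] completionHandler:^{
        NSError *error = nil;
        AVKeyValueStatus status = [asset statusOfValueForKey:tracksKey error:&error];

        AVPlayerItem *item = nil;
        if (status == AVKeyValueStatusLoaded) {
            // First step succeeded: build the item from the loaded asset.
            item = [AVPlayerItem playerItemWithAsset:asset];
        } else {
            // First step failed: let AVPlayerItem create its own asset from the URL.
            item = [AVPlayerItem playerItemWithURL:url];
        }

        // The handler may run on an arbitrary thread, so hop to the main queue
        // before handing the item to code that may touch the UI.
        dispatch_async(dispatch_get_main_queue(), ^{
            if (completion) {
                completion(item);
            }
        });
    }];
}
```

Either branch yields a player item that can then be associated with a player.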

如果你不知道现有的 URL 是什么类型的,按照下面步骤:

  • 尝试用 URL 初始化一个 AVURLAsset ,然后加载它的 tracks 键。如果轨道加载成功,就为该资产创建一个播放器项目。
  • 如果上一步失败,直接从 URL 创建一个 AVPlayerItem 。观察这个播放器的 status 属性来决定它是否是可播放的。


Playing an Item – 播放一个项目

To start playback, you send a play message to the player.


- (IBAction)play:sender {
    [player play];
}

In addition to simply playing, you can manage various aspects of the playback, such as the rate and the location of the playhead. You can also monitor the play state of the player; this is useful if you want to, for example, synchronize the user interface to the presentation state of the asset—see Monitoring Playback.

除了简单的播放,可以管理播放的各个方面,如速度和播放头的位置。也可以监视播放器的播放状态;这是很有用的,例如如果你想将用户界面同步到资产的呈现状态 – 详情看:Monitoring Playback.

Changing the Playback Rate – 改变播放的速率

You change the rate of playback by setting the player’s rate property.

通过设置播放器的 rate 属性来改变播放速率。

aPlayer.rate = 0.5;
aPlayer.rate = 2.0;

A value of 1.0 means “play at the natural rate of the current item”. Setting the rate to 0.0 is the same as pausing playback—you can also use pause.

值如果是 1.0 意味着 “当前项目按正常速率播放”。将速率设置为 0.0 就和暂停播放一样了 – 也可以使用 pause

Items that support reverse playback can use the rate property with a negative number to set the reverse playback rate. You determine the type of reverse play that is supported by using the playerItem properties canPlayReverse (supports a rate value of -1.0), canPlaySlowReverse (supports rates between -1.0 and 0.0) and canPlayFastReverse (supports rate values less than -1.0).

支持逆向播放的项目可以把 rate 属性设置为负数来指定反向播放速率。通过 playerItem 的属性可以确定所支持的反向播放类型:canPlayReverse(支持速率值 -1.0)、canPlaySlowReverse(支持 -1.0 到 0.0 之间的速率)和 canPlayFastReverse(支持小于 -1.0 的速率值)。
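
A minimal sketch of how these capability checks might gate a negative rate. The helper name SetReverseRateIfSupported is hypothetical, and the mapping assumes the three flags cover a rate of exactly -1.0, reverse rates slower than that, and reverse rates faster than that.

```objc
#import <AVFoundation/AVFoundation.h>

// Hypothetical helper: applies a negative rate only if the current item
// reports support for that kind of reverse playback.
// Returns YES if the rate was applied.
static BOOL SetReverseRateIfSupported(AVPlayer *player, float rate) {
    AVPlayerItem *item = player.currentItem;
    if (item == nil || rate >= 0.0f) {
        return NO;  // Nothing to play, or not a reverse rate.
    }

    BOOL supported;
    if (rate == -1.0f) {
        supported = item.canPlayReverse;       // Normal-speed reverse.
    } else if (rate > -1.0f) {
        supported = item.canPlaySlowReverse;   // Slower than normal reverse.
    } else {
        supported = item.canPlayFastReverse;   // Faster than normal reverse.
    }

    if (supported) {
        player.rate = rate;
    }
    return supported;
}
```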

Seeking—Repositioning the Playhead – 寻找 – 重新定位播放头

To move the playhead to a particular time, you generally use seekToTime: as follows:

通常使用 seekToTime: 把播放头移动到一个指定的时间,示例:

CMTime fiveSecondsIn=CMTimeMake(5, 1);
[player seekToTime:fiveSecondsIn];

The seekToTime: method, however, is tuned for performance rather than precision. If you need to move the playhead precisely, instead you use seekToTime:toleranceBefore:toleranceAfter: as in the following code fragment:

然而 seekToTime: 方法是为性能而不是精度优化的。如果你需要精确地移动播放头,应改用 seekToTime:toleranceBefore:toleranceAfter: ,示例代码:

CMTime fiveSecondsIn=CMTimeMake(5, 1);
[player seekToTime:fiveSecondsIn toleranceBefore:kCMTimeZero toleranceAfter:kCMTimeZero];

Using a tolerance of zero may require the framework to decode a large amount of data. You should use zero only if you are, for example, writing a sophisticated media editing application that requires precise control.
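
Between the default behavior and a zero tolerance there is a middle ground: a small non-zero tolerance. The sketch below assumes a half-second window is acceptable for your interface; the helper name SeekApproximately and the 0.5-second value are illustrative choices, not recommendations from the guide.

```objc
#import <AVFoundation/AVFoundation.h>

// Hypothetical helper: seeks to `seconds`, accepting a landing point up to
// half a second on either side of the target.
static void SeekApproximately(AVPlayer *player, Float64 seconds) {
    CMTime target = CMTimeMakeWithSeconds(seconds, 600);
    CMTime tolerance = CMTimeMakeWithSeconds(0.5, 600);

    [player seekToTime:target
       toleranceBefore:tolerance
        toleranceAfter:tolerance
     completionHandler:^(BOOL finished) {
         // `finished` is NO if the seek was interrupted, for example by another seek.
         if (finished) {
             NSLog(@"Seek landed at %.2f s", CMTimeGetSeconds([player currentTime]));
         }
     }];
}
```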

After playback, the player’s head is set to the end of the item and further invocations of play have no effect. To position the playhead back at the beginning of the item, you can register to receive an AVPlayerItemDidPlayToEndTimeNotification notification from the item. In the notification’s callback method, you invoke seekToTime: with the argument kCMTimeZero.


播放结束后,播放头停在项目的结尾处,再次调用 play 不会有任何效果。要把播放头移回项目的开头,可以注册接收该项目发出的 AVPlayerItemDidPlayToEndTimeNotification 通知。在通知的回调方法中,以 kCMTimeZero 为参数调用 seekToTime: 方法。

// Register with the notification center after creating the player item.
[[NSNotificationCenter defaultCenter]
 addObserver:self
 selector:@selector(playerItemDidReachEnd:)
 name:AVPlayerItemDidPlayToEndTimeNotification
 object:<#The player item#>];

- (void)playerItemDidReachEnd:(NSNotification *)notification {
    [player seekToTime:kCMTimeZero];
}

Playing Multiple Items – 播放多个项目

You can use an AVQueuePlayer object to play a number of items in sequence. The AVQueuePlayer class is a subclass of AVPlayer. You initialize a queue player with an array of player items.

可以使用 AVQueuePlayer 对象去播放队列中的一些项目。AVQueuePlayer 类是 AVPlayer 的子类。初始化一个带着播放项目数组的队列播放器:

NSArray *items = <#An array of player items#>;
AVQueuePlayer *queuePlayer = [[AVQueuePlayer alloc] initWithItems:items];

You can then play the queue using play, just as you would an AVPlayer object. The queue player plays each item in turn. If you want to skip to the next item, you send the queue player an advanceToNextItem message.

可以使用 play 播放队列,就像使用 AVPlayer 对象一样。队列播放器依次播放每个项目。如果想要跳到下一项,给队列播放器发送一个 advanceToNextItem 消息。

You can modify the queue using insertItem:afterItem:, removeItem:, and removeAllItems. When adding a new item, you should typically check whether it can be inserted into the queue, using canInsertItem:afterItem:. You pass nil as the second argument to test whether the new item can be appended to the queue.

可以使用 insertItem:afterItem: 、removeItem: 和 removeAllItems 这三个方法修改队列。当添加一个新项目时,通常应该先用 canInsertItem:afterItem: 检查它是否可以被插入到队列中。传 nil 作为第二个参数,可以测试新项目是否能追加到队列末尾。

AVPlayerItem *anItem = <#Get a player item#>;
if ([queuePlayer canInsertItem:anItem afterItem:nil]) {
    [queuePlayer insertItem:anItem afterItem:nil];
}

Monitoring Playback – 监视播放

You can monitor a number of aspects of both the presentation state of a player and the player item being played. This is particularly useful for state changes that are not under your direct control. For example:

  • If the user uses multitasking to switch to a different application, a player’s rate property will drop to 0.0.
  • If you are playing remote media, a player item’s loadedTimeRanges and seekableTimeRanges properties will change as more data becomes available.

These properties tell you what portions of the player item’s timeline are available.

  • A player’s currentItem property changes as a player item is created for an HTTP live stream.
  • A player item’s tracks property may change while playing an HTTP live stream.

This may happen if the stream offers different encodings for the content; the tracks change if the player switches to a different encoding.

  • A player or player item’s status property may change if playback fails for some reason.

You can use key-value observing to monitor changes to values of these properties.


  • 如果用户使用多任务处理切换到另一个应用程序,播放器的 rate 属性将下降到 0.0
  • 如果正在播放远程媒体,播放项目的 loadedTimeRanges 和 seekableTimeRanges 属性会随着更多数据变得可用而改变。


  • 播放器的 currentItem 属性变化,随着播放项目被 HTTP 直播流创建。
  • 当播放 HTTP 直播流时,播放项目的 tracks 属性可能会改变。


  • 如果因为一些原因播放失败,播放器或者播放项目的 status 属性可能会改变。

可以使用 key-value observing 去监视这些属性值的改变。

Important: You should register for KVO change notifications and unregister from KVO change notifications on the main thread. This avoids the possibility of receiving a partial notification if a change is being made on another thread. AV Foundation invokes observeValueForKeyPath:ofObject:change:context: on the main thread, even if the change operation is made on another thread.

重要:你应该在主线程上注册和注销 KVO 变更通知。这样可以避免在另一个线程正在进行更改时只收到部分通知。即使更改操作发生在另一个线程,AV Foundation 也会在主线程上调用 observeValueForKeyPath:ofObject:change:context: 。

Responding to a Change in Status – 响应状态的变化

When a player or player item’s status changes, it emits a key-value observing change notification. If an object is unable to play for some reason (for example, if the media services are reset), the status changes to AVPlayerStatusFailed or AVPlayerItemStatusFailed as appropriate. In this situation, the value of the object’s error property is changed to an error object that describes why the object is no longer able to play.

当一个播放器或者播放项目的 status 改变时,它会发出一个 key-value observing 改变通知。如果一个对象由于某些原因不能播放(例如媒体服务被重置),status 会相应地变为 AVPlayerStatusFailed 或者 AVPlayerItemStatusFailed。在这种情况下,对象的 error 属性的值被更改为一个错误对象,该对象描述了为什么对象不能再播放。

AV Foundation does not specify which thread the notification is sent on. If you want to update the user interface, you must make sure that any relevant code is invoked on the main thread, for example by using dispatch_async to execute it on the main queue.

AV Foundation 没有指定通知在哪个线程上发送。如果要更新用户界面,必须确保相关的代码都在主线程上被调用,例如使用 dispatch_async 把代码派发到主队列执行。

- (void)observeValueForKeyPath:(NSString *)keyPath ofObject:(id)object
    change:(NSDictionary *)change context:(void *)context {

        if (context == <#Player status context#>) {
            AVPlayer *thePlayer = (AVPlayer *)object;
            if ([thePlayer status] == AVPlayerStatusFailed) {
                NSError *error = [<#The AVPlayer object#> error];
                // Respond to error: for example, display an alert sheet.
                return;
            }
            // Deal with other status change if appropriate.
            return;
        }
        // Deal with other change notifications if appropriate.
        [super observeValueForKeyPath:keyPath ofObject:object
         change:change context:context];
}

Tracking Readiness for Visual Display – 为视觉展示做追踪准备

You can observe an AVPlayerLayer object’s readyForDisplay property to be notified when the layer has user-visible content. In particular, you might insert the player layer into the layer tree only when there is something for the user to look at and then perform a transition from.

可以观察一个 AVPlayerLayer 对象的 readyForDisplay 属性,当层有了用户可见的内容时属性可以被通知。特别是,可能将播放器层插入到层树,只有当有东西给用户看的时候,在从里面执行一个转变。

Tracking Time – 追踪时间

To track changes in the position of the playhead in an AVPlayer object, you can use addPeriodicTimeObserverForInterval:queue:usingBlock: or addBoundaryTimeObserverForTimes:queue:usingBlock:. You might do this to, for example, update your user interface with information about time elapsed or time remaining, or perform some other user interface synchronization.

  • With addPeriodicTimeObserverForInterval:queue:usingBlock:, the block you provide is invoked at the interval you specify, if time jumps, and when playback starts or stops.
  • With addBoundaryTimeObserverForTimes:queue:usingBlock:, you pass an array of CMTime structures contained in NSValue objects. The block you provide is invoked whenever any of those times is traversed.

追踪一个 AVPlayer 对象中播放头位置的变化,可以使用 addPeriodicTimeObserverForInterval:queue:usingBlock: 或者 addBoundaryTimeObserverForTimes:queue:usingBlock: 。可以这样做,例如更新用户界面与时间消耗或者剩余时间的有关信息,或者执行一些其他用户界面的同步。

Both of the methods return an opaque object that serves as an observer. You must keep a strong reference to the returned object as long as you want the time observation block to be invoked by the player. You must also balance each invocation of these methods with a corresponding call to removeTimeObserver:.

With both of these methods, AV Foundation does not guarantee to invoke your block for every interval or boundary passed. AV Foundation does not invoke a block if execution of a previously invoked block has not completed. You must make sure, therefore, that the work you perform in the block does not overly tax the system.

这两种方法都返回一个作为观察者的不透明对象。只要你希望播放器调用时间观察的块,就必须对返回的对象保持一个强引用。你也必须平衡每次调用这些方法,与相应的调用 removeTimeObserver:.

有了这两种方法, AV Foundation 不保证每个间隔或者通过边界时都调用你的块。如果以前调用的块执行没有完成,AV Foundation不会调用块。因此必须确保你在该块中执行的工作不会对系统过载。

// Assume a property: @property (strong) id playerObserver;

Float64 durationSeconds = CMTimeGetSeconds([<#An asset#> duration]);
CMTime firstThird = CMTimeMakeWithSeconds(durationSeconds/3.0, 1);
CMTime secondThird = CMTimeMakeWithSeconds(durationSeconds*2.0/3.0, 1);
NSArray *times = @[[NSValue valueWithCMTime:firstThird], [NSValue valueWithCMTime:secondThird]];

self.playerObserver = [<#A player#> addBoundaryTimeObserverForTimes:times queue:NULL usingBlock:^{

    NSString *timeDescription = (NSString *)
        CFBridgingRelease(CMTimeCopyDescription(NULL, [self.player currentTime]));
    NSLog(@"Passed a boundary at %@", timeDescription);
}];
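
The periodic variant follows the same pattern as the boundary observer. The following is a rough sketch only: the helper name AddHalfSecondLogger and the half-second interval are hypothetical choices, and the block does deliberately little work so that it does not delay later invocations.

```objc
#import <AVFoundation/AVFoundation.h>

// Hypothetical helper: logs the playhead position roughly twice per second.
// The caller must keep a strong reference to the returned token and later
// pass it to -removeTimeObserver: on the same player.
static id AddHalfSecondLogger(AVPlayer *player) {
    CMTime interval = CMTimeMakeWithSeconds(0.5, 600);
    return [player addPeriodicTimeObserverForInterval:interval
                                                queue:dispatch_get_main_queue()
                                           usingBlock:^(CMTime time) {
        NSLog(@"Playhead at %.1f s", CMTimeGetSeconds(time));
    }];
}

// Usage, assuming a strong `playerObserver` property as in the example above:
//   self.playerObserver = AddHalfSecondLogger(self.player);
//   ...
//   [self.player removeTimeObserver:self.playerObserver];
//   self.playerObserver = nil;
```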

Reaching the End of an Item – 到达一个项目的结束

You can register to receive an AVPlayerItemDidPlayToEndTimeNotification notification when a player item has completed playback.

当一个播放项目已经完成播放的时候,可以注册接收一个 AVPlayerItemDidPlayToEndTimeNotification 通知。

[[NSNotificationCenter defaultCenter] addObserver:<#The observer, typically self#> selector:@selector(<#The selector name#>) name:AVPlayerItemDidPlayToEndTimeNotification object:<#A player item#>];

Putting It All Together: Playing a Video File Using AVPlayerLayer – 总而言之,使用 AVPlayerLayer 播放视频文件

This brief code example illustrates how you can use an AVPlayer object to play a video file. It shows how to:

  • Configure a view to use an AVPlayerLayer layer
  • Create an AVPlayer object
  • Create an AVPlayerItem object for a file-based asset and use key-value observing to observe its status
  • Respond to the item becoming ready to play by enabling a button
  • Play the item and then restore the player’s head to the beginning

这个简短的代码示例演示如何使用一个 AVPlayer 对象播放一个视频文件。它显示了如何:

  • 使用 AVPlayerLayer 层配置视图
  • 创建一个 AVPlayer 对象
  • 创建一个基于文件资产的 AVPlayerItem 对象和使用 key-value observing 去观察它的状态
  • 通过启用按钮来响应项目准备就绪播放
  • 播放项目,然后将播放器的头重置到开始位置

Note: To focus on the most relevant code, this example omits several aspects of a complete application, such as memory management and unregistering as an observer (for key-value observing or for the notification center). To use AV Foundation, you are expected to have enough experience with Cocoa to be able to infer the missing pieces.

注意:关注最相关的代码,这个例子中省略了一个完整应用程序的几个方面,比如内存管理和注销观察者(key-value observing 或者 notification center)。为了使用 AV Foundation ,你应该有足够的 Cocoa 经验,有能力去推断出丢失的碎片。

For a conceptual introduction to playback, skip to Playing Assets.

对于播放的概念性的介绍,跳去看 Playing Assets

The Player View – 播放器视图

To play the visual component of an asset, you need a view containing an AVPlayerLayer layer to which the output of an AVPlayer object can be directed. You can create a simple subclass of UIView to accommodate this:

播放一个资产的可视化部分,需要一个包含了 AVPlayerLayer 层的视图,AVPlayerLayer 层可以直接输出 AVPlayer 对象。可以创建一个 UIView 的简单子类来容纳:

#import <UIKit/UIKit.h>
#import <AVFoundation/AVFoundation.h>

@interface PlayerView : UIView
@property (nonatomic) AVPlayer *player;
@end

@implementation PlayerView
// Back the view with an AVPlayerLayer instead of a plain CALayer.
+ (Class)layerClass {
    return [AVPlayerLayer class];
}
- (AVPlayer *)player {
    return [(AVPlayerLayer *)[self layer] player];
}
- (void)setPlayer:(AVPlayer *)player {
    [(AVPlayerLayer *)[self layer] setPlayer:player];
}
@end

A Simple View Controller – 一个简单的 View Controller

Assume you have a simple view controller, declared as follows:

假设你有一个简单的 view controller,声明如下:

@class PlayerView;
@interface PlayerViewController : UIViewController

@property (nonatomic) AVPlayer *player;
@property (nonatomic) AVPlayerItem *playerItem;
@property (nonatomic, weak) IBOutlet PlayerView *playerView;
@property (nonatomic, weak) IBOutlet UIButton *playButton;
- (IBAction)loadAssetFromFile:sender;
- (IBAction)play:sender;
- (void)syncUI;
@end

The syncUI method synchronizes the button’s state with the player’s state:

syncUI 方法同步按钮状态和播放器的状态:

- (void)syncUI {
    if ((self.player.currentItem != nil) &&
        ([self.player.currentItem status] == AVPlayerItemStatusReadyToPlay)) {
        self.playButton.enabled = YES;
    }
    else {
        self.playButton.enabled = NO;
    }
}

You can invoke syncUI in the view controller’s viewDidLoad method to ensure a consistent user interface when the view is first displayed.

可以在视图控制器的 viewDidLoad 方法中调用 syncUI,以确保视图第一次显示时用户界面的一致性。

- (void)viewDidLoad {
    [super viewDidLoad];
    [self syncUI];
}

The other properties and methods are described in the remaining sections.


Creating the Asset – 创建一个资产

You create an asset from a URL using AVURLAsset. (The following example assumes your project contains a suitable video resource.)

使用 AVURLAsset 从一个 URL 创建一个资产。(下面的例子假设你的工程包含了一个合适的视频资源)

- (IBAction)loadAssetFromFile:sender {

    NSURL *fileURL = [[NSBundle mainBundle]
                      URLForResource:<#@"VideoFileName"#> withExtension:<#@"extension"#>];

    AVURLAsset *asset = [AVURLAsset URLAssetWithURL:fileURL options:nil];
    NSString *tracksKey = @"tracks";

    [asset loadValuesAsynchronouslyForKeys:@[tracksKey] completionHandler:
     ^{
         // The completion block goes here.
     }];
}

In the completion block, you create an instance of AVPlayerItem for the asset, create a player for that item, and set that player as the player for the player view. As with creating the asset, simply creating the player item does not mean it’s ready to use. To determine when it’s ready to play, you can observe the item’s status property. You should configure this observing before associating the player item instance with the player itself.

You trigger the player item’s preparation to play when you associate it with the player.

在完成块中,为资产创建一个 AVPlayerItem 的实例,并设置它为播放页面的播放器。与创建资产一样,简单地创建播放器项目并不意味着它已经准备好使用。为了确定它已经准备好了,可以观察项目的 status 属性。你应该在该播放器项目实例与播放器本身关联之前,配置这个 observing


// Define this constant for the key-value observation context.
static const NSString *ItemStatusContext;

// Completion handler block.
         // The handler can run on any thread; do the setup on the main queue.
         dispatch_async(dispatch_get_main_queue(),
            ^{
                   NSError *error;
                   AVKeyValueStatus status = [asset statusOfValueForKey:tracksKey error:&error];

                   if (status == AVKeyValueStatusLoaded) {
                       self.playerItem = [AVPlayerItem playerItemWithAsset:asset];
                       // ensure that this is done before the playerItem is associated with the player
                       [self.playerItem addObserver:self forKeyPath:@"status"
                        options:NSKeyValueObservingOptionInitial context:&ItemStatusContext];
                       [[NSNotificationCenter defaultCenter] addObserver:self
                        selector:@selector(playerItemDidReachEnd:)
                        name:AVPlayerItemDidPlayToEndTimeNotification
                        object:self.playerItem];
                       self.player = [AVPlayer playerWithPlayerItem:self.playerItem];
                       [self.playerView setPlayer:self.player];
                   }
                   else {
                       // You should deal with the error appropriately.
                       NSLog(@"The asset's tracks were not loaded:\n%@", [error localizedDescription]);
                   }
            });

Responding to the Player Item’s Status Change – 响应播放项目的状态改变

When the player item’s status changes, the view controller receives a key-value observing change notification. AV Foundation does not specify which thread the notification is sent on. If you want to update the user interface, you must make sure that any relevant code is invoked on the main thread. This example uses dispatch_async to queue a message on the main thread to synchronize the user interface.

当播放项目的状态改变时,视图控制器接收一个 key-value observing 改变通知。AV Foundation 没有指定通知发送的是什么线程。如果你想更新用户界面,必须确保任何相关的代码都要在主线程中调用。这个例子使用 dispatch_async 让主线程同步用户界面的消息进入队列。
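
A minimal sketch of the observer method this paragraph describes, assuming the ItemStatusContext constant and the syncUI method defined earlier in this example. Treat it as an illustration rather than the guide's exact listing.

```objc
// In the PlayerViewController implementation. ItemStatusContext is the
// context constant defined alongside the completion handler.
- (void)observeValueForKeyPath:(NSString *)keyPath ofObject:(id)object
                        change:(NSDictionary *)change context:(void *)context {
    if (context == &ItemStatusContext) {
        // The notification may arrive on any thread; update the UI on the main queue.
        dispatch_async(dispatch_get_main_queue(), ^{
            [self syncUI];
        });
        return;
    }
    // Not an observation registered by this class: pass it to the superclass.
    [super observeValueForKeyPath:keyPath ofObject:object
                           change:change context:context];
}
```

Comparing the context pointer, rather than the key path string, lets the method tell its own observations apart from any a superclass may have registered.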

Playing the Item – 播放项目

Playing the item involves sending a play message to the player.


- (IBAction)play:sender {
    [self.player play];
}

The item is played only once. After playback, the player’s head is set to the end of the item, and further invocations of the play method will have no effect. To position the playhead back at the beginning of the item, you can register to receive an AVPlayerItemDidPlayToEndTimeNotification from the item. In the notification’s callback method, invoke seekToTime: with the argument kCMTimeZero.

该项目只播放一次。播放结束后,播放头停在项目的结束位置,再次调用 play 方法将没有效果。要把播放头移回项目的开头,可以注册接收该项目发出的 AVPlayerItemDidPlayToEndTimeNotification。在通知的回调方法中,以 kCMTimeZero 为参数调用 seekToTime: 方法。

// Register with the notification center after creating the player item.
[[NSNotificationCenter defaultCenter] addObserver:self
 selector:@selector(playerItemDidReachEnd:)
 name:AVPlayerItemDidPlayToEndTimeNotification
 object:[self.player currentItem]];

- (void)playerItemDidReachEnd:(NSNotification *)notification {
    [self.player seekToTime:kCMTimeZero];
}

Editing – 编辑

The AVFoundation framework provides a feature-rich set of classes to facilitate the editing of audio visual assets. At the heart of AVFoundation’s editing API are compositions. A composition is simply a collection of tracks from one or more different media assets. The AVMutableComposition class provides an interface for inserting and removing tracks, as well as managing their temporal orderings. Figure 3-1 shows how a new composition is pieced together from a combination of existing assets to form a new asset. If all you want to do is merge multiple assets together sequentially into a single file, that is as much detail as you need. If you want to perform any custom audio or video processing on the tracks in your composition, you need to incorporate an audio mix or a video composition, respectively.

AVFoundation 框架提供了一组功能丰富的类来帮助编辑音视频资产。AVFoundation 编辑 API 的核心是组合(composition)。一个组合就是来自一个或多个不同媒体资产的轨道的集合。AVMutableComposition 类提供了插入和移除轨道以及管理它们时间顺序的接口。图 3-1 显示了一个新的组合是怎样由一些现有的资产拼合起来,形成新的资产。如果你只是想把多个资产按顺序合并为一个文件,了解这些就足够了。如果你想对组合中的轨道执行任何自定义的音频或视频处理,则需要分别结合音频混合(audio mix)或视频组合(video composition)。

[Figure 3-1]

Using the AVMutableAudioMix class, you can perform custom audio processing on the audio tracks in your composition, as shown in Figure 3-2. Currently, you can specify a maximum volume or set a volume ramp for an audio track.

使用 AVMutableAudioMix 类,可以在你作品的音频轨道中执行自定义处理,如图 3-2 所示。目前,你可以指定一个最大音量或设置一个音频轨道的音量斜坡

[Figure 3-2]

You can use the AVMutableVideoComposition class to work directly with the video tracks in your composition for the purposes of editing, shown in Figure 3-3. With a single video composition, you can specify the desired render size and scale, as well as the frame duration, for the output video. Through a video composition’s instructions (represented by the AVMutableVideoCompositionInstruction class), you can modify the background color of your video and apply layer instructions. These layer instructions (represented by the AVMutableVideoCompositionLayerInstruction class) can be used to apply transforms, transform ramps, opacity and opacity ramps to the video tracks within your composition. The video composition class also gives you the ability to introduce effects from the Core Animation framework into your video using the animationTool property.

可以使用 AVMutableVideoComposition 类直接处理组件中的视频轨道以进行编辑,如图 3-3 所示。通过单个视频组件,可以为输出视频指定所需的渲染尺寸和缩放比例,以及帧的持续时间。通过视频组件的指令(由 AVMutableVideoCompositionInstruction 类表示),可以修改视频的背景颜色并应用层指令。这些层指令(由 AVMutableVideoCompositionLayerInstruction 类表示)可以用来对组件中的视频轨道应用变换、变换渐变、不透明度以及不透明度渐变。视频组件类还可以通过 animationTool 属性把 Core Animation 框架的效果引入视频。

[Figure 3-3]

To combine your composition with an audio mix and a video composition, you use an AVAssetExportSession object, as shown in Figure 3-4. You initialize the export session with your composition and then simply assign your audio mix and video composition to the audioMix and videoComposition properties respectively.

要把你的组合与音频混合和视频组合结合起来,可以使用 AVAssetExportSession 对象,如图 3-4 所示。用你的组合初始化导出会话,然后分别把音频混合和视频组合赋给 audioMix 和 videoComposition 属性即可。

[Figure 3-4]
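
As a rough sketch of that last step, the following shows an export session configured with both objects. The helper name ExportComposition, the choice of AVAssetExportPresetHighestQuality, and the QuickTime output type are assumptions made for illustration; pick the preset and file type that suit your content.

```objc
#import <AVFoundation/AVFoundation.h>

// Hypothetical helper: exports `composition` with the given audio mix and
// video composition applied. `outputURL` must point to a file that does not
// exist yet, inside a location the app may write to.
static void ExportComposition(AVComposition *composition,
                              AVAudioMix *audioMix,
                              AVVideoComposition *videoComposition,
                              NSURL *outputURL) {
    AVAssetExportSession *exporter =
        [[AVAssetExportSession alloc] initWithAsset:composition
                                         presetName:AVAssetExportPresetHighestQuality];
    exporter.audioMix = audioMix;
    exporter.videoComposition = videoComposition;
    exporter.outputURL = outputURL;
    exporter.outputFileType = AVFileTypeQuickTimeMovie;

    [exporter exportAsynchronouslyWithCompletionHandler:^{
        if (exporter.status == AVAssetExportSessionStatusCompleted) {
            NSLog(@"Export finished: %@", outputURL);
        } else {
            // Covers both failure and cancellation.
            NSLog(@"Export did not complete: %@", exporter.error);
        }
    }];
}
```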

Creating a Composition – 创建组件

To create your own composition, you use the AVMutableComposition class. To add media data to your composition, you must add one or more composition tracks, represented by the AVMutableCompositionTrack class. The simplest case is creating a mutable composition with one video track and one audio track:

使用 AVMutableComposition 类创建自己的组件。在你的组件中添加媒体数据,必须添加一个或者多个组件轨道,以 AVMutableCompositionTrack 类为代表。最简单的例子创建一个有一个音频轨道和一个视频轨道的可变组件。

AVMutableComposition *mutableComposition = [AVMutableComposition composition];
// Create the video composition track.
AVMutableCompositionTrack *mutableCompositionVideoTrack = [mutableComposition addMutableTrackWithMediaType:AVMediaTypeVideo preferredTrackID:kCMPersistentTrackID_Invalid];
// Create the audio composition track.
AVMutableCompositionTrack *mutableCompositionAudioTrack = [mutableComposition addMutableTrackWithMediaType:AVMediaTypeAudio preferredTrackID:kCMPersistentTrackID_Invalid];

Options for Initializing a Composition Track – 初始化组件轨道的选项

When adding new tracks to a composition, you must provide both a media type and a track ID. Although audio and video are the most commonly used media types, you can specify other media types as well, such as AVMediaTypeSubtitle or AVMediaTypeText.

Every track associated with some audiovisual data has a unique identifier referred to as a track ID. If you specify kCMPersistentTrackID_Invalid as the preferred track ID, a unique identifier is automatically generated for you and associated with the track.

当给组件添加新轨道时,必须提供媒体类型和轨道 ID 。虽然音频和视频是最常用的媒体类型,你也可以指定其他媒体类型,比如 AVMediaTypeSubtitle 或者 AVMediaTypeText 。

每个和视听数据相关联的轨道都有一个唯一的标识符,叫做 track ID。如果你指定 kCMPersistentTrackID_Invalid 作为首选的 track ID,系统会自动为你生成一个唯一的标识符并与轨道相关联。

Adding Audiovisual Data to a Composition – 将视听数据添加到一个组件中

Once you have a composition with one or more tracks, you can begin adding your media data to the appropriate tracks. To add media data to a composition track, you need access to the AVAsset object where the media data is located. You can use the mutable composition track interface to place multiple tracks with the same underlying media type together on the same track. The following example illustrates how to add two different video asset tracks in sequence to the same composition track:

一旦有了带有一个或多个轨道的组件,就可以把媒体数据添加到适当的轨道中。为了将媒体数据添加到组件轨道,需要访问媒体数据所在的 AVAsset 对象。可以使用可变组件轨道接口将具有相同基础媒体类型的多个轨道放置到同一个轨道上。下面的示例演示了如何将两个不同的视频资产轨道按顺序添加到同一个组件轨道中:

// You can retrieve AVAssets from a number of places, like the camera roll for example.
AVAsset *videoAsset = <#AVAsset with at least one video track#>;
AVAsset *anotherVideoAsset = <#another AVAsset with at least one video track#>;
// Get the first video track from each asset.
AVAssetTrack *videoAssetTrack = [[videoAsset tracksWithMediaType:AVMediaTypeVideo] objectAtIndex:0];
AVAssetTrack *anotherVideoAssetTrack = [[anotherVideoAsset tracksWithMediaType:AVMediaTypeVideo] objectAtIndex:0];
// Add them both to the composition.
[mutableCompositionVideoTrack insertTimeRange:CMTimeRangeMake(kCMTimeZero,videoAssetTrack.timeRange.duration) ofTrack:videoAssetTrack atTime:kCMTimeZero error:nil];
[mutableCompositionVideoTrack insertTimeRange:CMTimeRangeMake(kCMTimeZero,anotherVideoAssetTrack.timeRange.duration) ofTrack:anotherVideoAssetTrack atTime:videoAssetTrack.timeRange.duration error:nil];
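The insertion times in the example above follow a simple rule: each segment is placed where the previous one ends, so the start time of a segment is the sum of the durations before it. The following is a minimal sketch of that arithmetic in plain C (which also compiles as Objective-C). SimpleTime, simple_time_add, and sequential_start_times are hypothetical names introduced here as stand-ins for CMTime and CMTimeAdd so that the sketch runs without Core Media; they are not part of AVFoundation.

```c
#include <stdint.h>

/* Hypothetical stand-in for CMTime: the time is value / timescale seconds. */
typedef struct {
    int64_t value;
    int32_t timescale;
} SimpleTime;

/* Greatest common divisor, used to find a shared timescale. */
static int64_t gcd64(int64_t a, int64_t b) {
    while (b != 0) {
        int64_t t = a % b;
        a = b;
        b = t;
    }
    return a;
}

/* Adds two times exactly by converting both to a common timescale. */
static SimpleTime simple_time_add(SimpleTime a, SimpleTime b) {
    int64_t common = (int64_t)a.timescale / gcd64(a.timescale, b.timescale) * b.timescale;
    SimpleTime sum;
    sum.value = a.value * (common / a.timescale) + b.value * (common / b.timescale);
    sum.timescale = (int32_t)common;
    return sum;
}

/* Writes into `starts` the insertion time of each segment when the segments
   are laid end to end, and returns the total duration. */
static SimpleTime sequential_start_times(const SimpleTime *durations, int count, SimpleTime *starts) {
    SimpleTime cursor = {0, 1};
    for (int i = 0; i < count; i++) {
        starts[i] = cursor;
        cursor = simple_time_add(cursor, durations[i]);
    }
    return cursor;
}
```

For example, two segments lasting 300/600 and 90/30 seconds start at 0 and at 300/600 seconds, and the composition ends at 2100/600 seconds (3.5 seconds). In real code, CMTimeAdd performs this addition on CMTime values.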

Retrieving Compatible Composition Tracks

Where possible, you should have only one composition track for each media type. This unification of compatible asset tracks leads to a minimal amount of resource usage. When presenting media data serially, you should place any media data of the same type on the same composition track. You can query a mutable composition to find out if there are any composition tracks compatible with your desired asset track:

AVMutableCompositionTrack *compatibleCompositionTrack = [mutableComposition mutableTrackCompatibleWithTrack:<#the AVAssetTrack you want to insert#>];
if (compatibleCompositionTrack) {
    // Implementation continues.
}

Note: Placing multiple video segments on the same composition track can potentially lead to dropping frames at the transitions between video segments, especially on embedded devices. Choosing the number of composition tracks for your video segments depends entirely on the design of your app and its intended platform.


Generating a Volume Ramp

A single AVMutableAudioMix object can perform custom audio processing on all of the audio tracks in your composition individually. You create an audio mix using the audioMix class method, and you use instances of the AVMutableAudioMixInputParameters class to associate the audio mix with specific tracks within your composition. An audio mix can be used to vary the volume of an audio track. The following example shows how to set a volume ramp on a specific audio track to slowly fade the audio out over the duration of the composition:

AVMutableAudioMix *mutableAudioMix = [AVMutableAudioMix audioMix];
// Create the audio mix input parameters object.
AVMutableAudioMixInputParameters *mixParameters = [AVMutableAudioMixInputParameters audioMixInputParametersWithTrack:mutableCompositionAudioTrack];
// Set the volume ramp to slowly fade the audio out over the duration of the composition.
[mixParameters setVolumeRampFromStartVolume:1.f toEndVolume:0.f timeRange:CMTimeRangeMake(kCMTimeZero, mutableComposition.duration)];
// Attach the input parameters to the audio mix.
mutableAudioMix.inputParameters = @[mixParameters];
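To make the effect of the ramp concrete, the sketch below computes the volume at a given time, assuming the volume changes linearly between the start and end values across the time range. Linear interpolation and the clamping outside the range are assumptions of this sketch, and ramp_value_at is a hypothetical helper written in plain C with times in seconds; it is not an AVFoundation call.

```c
/* Value of a ramp at time t (in seconds), assuming linear interpolation
   inside the range and clamping to the nearest endpoint outside it. */
static float ramp_value_at(float start_value, float end_value,
                           double range_start, double range_duration, double t) {
    if (range_duration <= 0.0 || t >= range_start + range_duration) {
        return end_value;
    }
    if (t <= range_start) {
        return start_value;
    }
    double fraction = (t - range_start) / range_duration;
    return (float)(start_value + (end_value - start_value) * fraction);
}
```

With the fade-out above applied to a 10-second composition, ramp_value_at(1.0f, 0.0f, 0.0, 10.0, 5.0) gives 0.5, that is, half volume at the midpoint.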

Performing Custom Video Processing

As with an audio mix, you only need one AVMutableVideoComposition object to perform all of your custom video processing on your composition’s video tracks. Using a video composition, you can directly set the appropriate render size, scale, and frame rate for your composition’s video tracks. For a detailed example of setting appropriate values for these properties, see Setting the Render Size and Frame Duration.

Changing the Composition’s Background Color

All video compositions must also have an array of AVVideoCompositionInstruction objects containing at least one video composition instruction. You use the AVMutableVideoCompositionInstruction class to create your own video composition instructions. Using video composition instructions, you can modify the composition’s background color, specify whether post processing is needed, or apply layer instructions.

The following example illustrates how to create a video composition instruction that changes the background color to red for the entire composition:


AVMutableVideoCompositionInstruction *mutableVideoCompositionInstruction = [AVMutableVideoCompositionInstruction videoCompositionInstruction];
mutableVideoCompositionInstruction.timeRange = CMTimeRangeMake(kCMTimeZero, mutableComposition.duration);
mutableVideoCompositionInstruction.backgroundColor = [[UIColor redColor] CGColor];

Applying Opacity Ramps

Video composition instructions can also be used to apply video composition layer instructions. An AVMutableVideoCompositionLayerInstruction object can apply transforms, transform ramps, opacity, and opacity ramps to a certain video track within a composition. The order of the layer instructions in a video composition instruction’s layerInstructions array determines how video frames from source tracks should be layered and composed for the duration of that composition instruction. The following code fragment shows how to set an opacity ramp to slowly fade out the first video in a composition before transitioning to the second video:

AVAssetTrack *firstVideoAssetTrack = <#AVAssetTrack representing the first video segment played in the composition#>;
AVAssetTrack *secondVideoAssetTrack = <#AVAssetTrack representing the second video segment played in the composition#>;
// Create the first video composition instruction.
AVMutableVideoCompositionInstruction *firstVideoCompositionInstruction = [AVMutableVideoCompositionInstruction videoCompositionInstruction];
// Set its time range to span the duration of the first video track.
firstVideoCompositionInstruction.timeRange = CMTimeRangeMake(kCMTimeZero, firstVideoAssetTrack.timeRange.duration);
// Create the layer instruction and associate it with the composition video track.
AVMutableVideoCompositionLayerInstruction *firstVideoLayerInstruction = [AVMutableVideoCompositionLayerInstruction videoCompositionLayerInstructionWithAssetTrack:mutableCompositionVideoTrack];
// Create the opacity ramp to fade out the first video track over its entire duration.
[firstVideoLayerInstruction setOpacityRampFromStartOpacity:1.f toEndOpacity:0.f timeRange:CMTimeRangeMake(kCMTimeZero, firstVideoAssetTrack.timeRange.duration)];
// Create the second video composition instruction so that the second video track isn't transparent.
AVMutableVideoCompositionInstruction *secondVideoCompositionInstruction = [AVMutableVideoCompositionInstruction videoCompositionInstruction];
// Set its time range to start at the end of the first video track and span the duration of the second (CMTimeRangeMake takes a start time and a duration).
secondVideoCompositionInstruction.timeRange = CMTimeRangeMake(firstVideoAssetTrack.timeRange.duration, secondVideoAssetTrack.timeRange.duration);
// Create the second layer instruction and associate it with the composition video track.
AVMutableVideoCompositionLayerInstruction *secondVideoLayerInstruction = [AVMutableVideoCompositionLayerInstruction videoCompositionLayerInstructionWithAssetTrack:mutableCompositionVideoTrack];
// Attach the first layer instruction to the first video composition instruction.
firstVideoCompositionInstruction.layerInstructions = @[firstVideoLayerInstruction];
// Attach the second layer instruction to the second video composition instruction.
secondVideoCompositionInstruction.layerInstructions = @[secondVideoLayerInstruction];
// Attach both of the video composition instructions to the video composition.
AVMutableVideoComposition *mutableVideoComposition = [AVMutableVideoComposition videoComposition];
mutableVideoComposition.instructions = @[firstVideoCompositionInstruction, secondVideoCompositionInstruction];
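Because CMTimeRangeMake takes a start time and a duration (not a start time and an end time), it is easy to build instruction time ranges that overlap or run past the end of the composition. Video composition instructions are expected to follow one another without gaps or overlaps, so a quick consistency check can be useful. The sketch below, in plain C, checks that a list of ranges starts at zero, has no gaps or overlaps, and ends at the total duration. SimpleRange and ranges_tile_duration are hypothetical names, with all times expressed as integers in one shared timescale; real code would use CMTimeRange and the Core Media comparison functions instead.

```c
#include <stdint.h>

/* Hypothetical stand-in for CMTimeRange, with both fields in the same
   timescale units: a start time and a duration (not an end time). */
typedef struct {
    int64_t start;
    int64_t duration;
} SimpleRange;

/* Returns 1 if the ranges begin at zero, follow each other without gaps
   or overlaps, and end exactly at `total`; returns 0 otherwise. */
static int ranges_tile_duration(const SimpleRange *ranges, int count, int64_t total) {
    int64_t expected_start = 0;
    for (int i = 0; i < count; i++) {
        if (ranges[i].duration <= 0 || ranges[i].start != expected_start) {
            return 0;
        }
        expected_start = ranges[i].start + ranges[i].duration;
    }
    return expected_start == total;
}
```

For a 300-unit first video and a 500-unit second video, {0, 300} followed by {300, 500} passes, while {0, 300} followed by {300, 800} fails because the second range would end at 1100 rather than 800.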

Incorporating Core Animation Effects

A video composition can add the power of Core Animation to your composition through the animationTool property. Through this animation tool, you can accomplish tasks such as watermarking video and adding titles or animating overlays. Core Animation can be used in two different ways with video compositions: You can add a Core Animation layer as its own individual composition track, or you can render Core Animation effects (using a Core Animation layer) into the video frames in your composition directly. The following code shows the latter option by adding a horizontally centered watermark to the video:

CALayer *watermarkLayer = <#CALayer representing your desired watermark image#>;
CALayer *parentLayer = [CALayer layer];
CALayer *videoLayer = [CALayer layer];
parentLayer.frame = CGRectMake(0, 0, mutableVideoComposition.renderSize.width, mutableVideoComposition.renderSize.height);
videoLayer.frame = CGRectMake(0, 0, mutableVideoComposition.renderSize.width, mutableVideoComposition.renderSize.height);
[parentLayer addSublayer:videoLayer];
watermarkLayer.position = CGPointMake(mutableVideoComposition.renderSize.width/2, mutableVideoComposition.renderSize.height/4);
[parentLayer addSublayer:watermarkLayer];
mutableVideoComposition.animationTool = [AVVideoCompositionCoreAnimationTool videoCompositionCoreAnimationToolWithPostProcessingAsVideoLayer:videoLayer inLayer:parentLayer];

Putting It All Together: Combining Multiple Assets and Saving the Result to the Camera Roll

This brief code example illustrates how you can combine two video asset tracks and an audio asset track to create a single video file. It shows how to:

  • Create an AVMutableComposition object and add multiple AVMutableCompositionTrack objects
  • Add time ranges of AVAssetTrack objects to compatible composition tracks
  • Check the preferredTransform property of a video asset track to determine the video’s orientation
  • Use AVMutableVideoCompositionLayerInstruction objects to apply transforms to the video tracks within a composition
  • Set appropriate values for the renderSize and frameDuration properties of a video composition
  • Use a composition in conjunction with a video composition when exporting to a video file
  • Save a video file to the Camera Roll


Note: To focus on the most relevant code, this example omits several aspects of a complete app, such as memory management and error handling. To use AVFoundation, you are expected to have enough experience with Cocoa to infer the missing pieces.

Creating the Composition

To piece together tracks from separate assets, you use an AVMutableComposition object. Create the composition and add one audio and one video track.

AVMutableComposition *mutableComposition = [AVMutableComposition composition];
AVMutableCompositionTrack *videoCompositionTrack = [mutableComposition addMutableTrackWithMediaType:AVMediaTypeVideo preferredTrackID:kCMPersistentTrackID_Invalid];
AVMutableCompositionTrack *audioCompositionTrack = [mutableComposition addMutableTrackWithMediaType:AVMediaTypeAudio preferredTrackID:kCMPersistentTrackID_Invalid];

Adding the Assets

An empty composition does you no good. Add the two video asset tracks and the audio asset track to the composition.


AVAssetTrack *firstVideoAssetTrack = [[firstVideoAsset tracksWithMediaType:AVMediaTypeVideo] objectAtIndex:0];
AVAssetTrack *secondVideoAssetTrack = [[secondVideoAsset tracksWithMediaType:AVMediaTypeVideo] objectAtIndex:0];
[videoCompositionTrack insertTimeRange:CMTimeRangeMake(kCMTimeZero, firstVideoAssetTrack.timeRange.duration) ofTrack:firstVideoAssetTrack atTime:kCMTimeZero error:nil];
[videoCompositionTrack insertTimeRange:CMTimeRangeMake(kCMTimeZero, secondVideoAssetTrack.timeRange.duration) ofTrack:secondVideoAssetTrack atTime:firstVideoAssetTrack.timeRange.duration error:nil];
[audioCompositionTrack insertTimeRange:CMTimeRangeMake(kCMTimeZero, CMTimeAdd(firstVideoAssetTrack.timeRange.duration, secondVideoAssetTrack.timeRange.duration)) ofTrack:[[audioAsset tracksWithMediaType:AVMediaTypeAudio] objectAtIndex:0] atTime:kCMTimeZero error:nil];

Note: This assumes that you have two assets that contain at least one video track each and a third asset that contains at least one audio track. The videos can be retrieved from the Camera Roll, and the audio track can be retrieved from the music library or the videos themselves.


Checking the Video Orientations

Once you add your video and audio tracks to the composition, you need to ensure that the orientations of both video tracks are correct. By default, all video tracks are assumed to be in landscape mode. If your video track was taken in portrait mode, the video will not be oriented properly when it is exported. Likewise, if you try to combine a video shot in portrait mode with a video shot in landscape mode, the export session will fail to complete.


BOOL isFirstVideoAssetPortrait = NO;
CGAffineTransform firstTransform = firstVideoAssetTrack.preferredTransform;
// Check the first video track's preferred transform to determine if it was recorded in portrait mode.
if (firstTransform.a == 0 && firstTransform.d == 0 && (firstTransform.b == 1.0 || firstTransform.b == -1.0) && (firstTransform.c == 1.0 || firstTransform.c == -1.0)) {
    isFirstVideoAssetPortrait = YES;
}
BOOL isSecondVideoAssetPortrait = NO;
CGAffineTransform secondTransform = secondVideoAssetTrack.preferredTransform;
// Check the second video track's preferred transform to determine if it was recorded in portrait mode.
if (secondTransform.a == 0 && secondTransform.d == 0 && (secondTransform.b == 1.0 || secondTransform.b == -1.0) && (secondTransform.c == 1.0 || secondTransform.c == -1.0)) {
    isSecondVideoAssetPortrait = YES;
}
// Stop here if one video is portrait and the other is landscape.
if ((isFirstVideoAssetPortrait && !isSecondVideoAssetPortrait) || (!isFirstVideoAssetPortrait && isSecondVideoAssetPortrait)) {
    UIAlertView *incompatibleVideoOrientationAlert = [[UIAlertView alloc] initWithTitle:@"Error!" message:@"Cannot combine a video shot in portrait mode with a video shot in landscape mode." delegate:self cancelButtonTitle:@"Dismiss" otherButtonTitles:nil];
    [incompatibleVideoOrientationAlert show];
    return;
}
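The portrait test above can be factored into a small predicate. The following sketch expresses the same comparison in plain C (which also compiles as Objective-C); SimpleAffineTransform is a hypothetical stand-in for CGAffineTransform so that the code runs without Core Graphics, and the function names are illustrative. Like the original check, it only recognizes transforms whose components are exactly 0 and +1 or -1.

```c
/* Hypothetical stand-in for CGAffineTransform (same field names). */
typedef struct {
    double a, b, c, d, tx, ty;
} SimpleAffineTransform;

/* Returns 1 when the transform swaps the x and y axes (a and d are zero,
   b and c are +1 or -1), which is the portrait test used in this section. */
static int is_portrait_transform(SimpleAffineTransform t) {
    return t.a == 0.0 && t.d == 0.0 &&
           (t.b == 1.0 || t.b == -1.0) &&
           (t.c == 1.0 || t.c == -1.0);
}

/* Returns 1 when both tracks share an orientation and can be combined. */
static int orientations_are_compatible(SimpleAffineTransform first,
                                       SimpleAffineTransform second) {
    return is_portrait_transform(first) == is_portrait_transform(second);
}
```

A track whose transform is {0, 1, -1, 0, 720, 0} is treated as portrait, while the identity transform {1, 0, 0, 1, 0, 0} is treated as landscape, so the pair is reported as incompatible.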

Applying the Video Composition Layer Instructions

Once you know the video segments have compatible orientations, you can apply the necessary layer instructions to each one and add these layer instructions to the video composition.


AVMutableVideoCompositionInstruction *firstVideoCompositionInstruction = [AVMutableVideoCompositionInstruction videoCompositionInstruction];
// Set the time range of the first instruction to span the duration of the first video track.
firstVideoCompositionInstruction.timeRange = CMTimeRangeMake(kCMTimeZero, firstVideoAssetTrack.timeRange.duration);
AVMutableVideoCompositionInstruction *secondVideoCompositionInstruction = [AVMutableVideoCompositionInstruction videoCompositionInstruction];
// Set the time range of the second instruction to start at the end of the first video track and span the duration of the second (CMTimeRangeMake takes a start time and a duration).
secondVideoCompositionInstruction.timeRange = CMTimeRangeMake(firstVideoAssetTrack.timeRange.duration, secondVideoAssetTrack.timeRange.duration);
AVMutableVideoCompositionLayerInstruction *firstVideoLayerInstruction = [AVMutableVideoCompositionLayerInstruction videoCompositionLayerInstructionWithAssetTrack:videoCompositionTrack];
// Set the transform of the first layer instruction to the preferred transform of the first video track.
[firstVideoLayerInstruction setTransform:firstTransform atTime:kCMTimeZero];
AVMutableVideoCompositionLayerInstruction *secondVideoLayerInstruction = [AVMutableVideoCompositionLayerInstruction videoCompositionLayerInstructionWithAssetTrack:videoCompositionTrack];
// Set the transform of the second layer instruction to the preferred transform of the second video track.
[secondVideoLayerInstruction setTransform:secondTransform atTime:firstVideoAssetTrack.timeRange.duration];
firstVideoCompositionInstruction.layerInstructions = @[firstVideoLayerInstruction];
secondVideoCompositionInstruction.layerInstructions = @[secondVideoLayerInstruction];
AVMutableVideoComposition *mutableVideoComposition = [AVMutableVideoComposition videoComposition];
mutableVideoComposition.instructions = @[firstVideoCompositionInstruction, secondVideoCompositionInstruction];

All AVAssetTrack objects have a preferredTransform property that contains the orientation information for that asset track. This transform is applied whenever the asset track is displayed onscreen. In the previous code, the layer instruction’s transform is set to the asset track’s transform so that the video in the new composition displays properly once you adjust its render size.

Setting the Render Size and Frame Duration

To complete the video orientation fix, you must adjust the renderSize property accordingly. You should also pick a suitable value for the frameDuration property, such as 1/30th of a second (or 30 frames per second). By default, the renderScale property is set to 1.0, which is appropriate for this composition.

CGSize naturalSizeFirst, naturalSizeSecond;
// If the first video asset was shot in portrait mode, then so was the second one if we made it here.
if (isFirstVideoAssetPortrait) {
    // Invert the width and height for the video tracks to ensure that they display properly.
    naturalSizeFirst = CGSizeMake(firstVideoAssetTrack.naturalSize.height, firstVideoAssetTrack.naturalSize.width);
    naturalSizeSecond = CGSizeMake(secondVideoAssetTrack.naturalSize.height, secondVideoAssetTrack.naturalSize.width);
}
else {
    // If the videos weren't shot in portrait mode, we can just use their natural sizes.
    naturalSizeFirst = firstVideoAssetTrack.naturalSize;
    naturalSizeSecond = secondVideoAssetTrack.naturalSize;
}
float renderWidth, renderHeight;
// Set the renderWidth and renderHeight to the max of the two videos' widths and heights.
if (naturalSizeFirst.width > naturalSizeSecond.width) {
    renderWidth = naturalSizeFirst.width;
}
else {
    renderWidth = naturalSizeSecond.width;
}
if (naturalSizeFirst.height > naturalSizeSecond.height) {
    renderHeight = naturalSizeFirst.height;
}
else {
    renderHeight = naturalSizeSecond.height;
}
mutableVideoComposition.renderSize = CGSizeMake(renderWidth, renderHeight);
// Set the frame duration to an appropriate value (i.e. 30 frames per second for video).
mutableVideoComposition.frameDuration = CMTimeMake(1,30);
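The render size logic above amounts to two steps: swap width and height for portrait footage, then take the larger width and the larger height of the two videos. The following is a minimal sketch of those steps in plain C, where SimpleSize is a hypothetical stand-in for CGSize and the function names are illustrative rather than part of any framework.

```c
/* Hypothetical stand-in for CGSize. */
typedef struct {
    double width;
    double height;
} SimpleSize;

/* Swaps width and height when the footage was recorded in portrait mode. */
static SimpleSize oriented_size(SimpleSize natural, int is_portrait) {
    SimpleSize result = natural;
    if (is_portrait) {
        result.width = natural.height;
        result.height = natural.width;
    }
    return result;
}

/* Picks a render size large enough for both videos, which are assumed
   to share the same orientation. */
static SimpleSize render_size_for(SimpleSize first, SimpleSize second, int is_portrait) {
    SimpleSize a = oriented_size(first, is_portrait);
    SimpleSize b = oriented_size(second, is_portrait);
    SimpleSize render;
    render.width = a.width > b.width ? a.width : b.width;
    render.height = a.height > b.height ? a.height : b.height;
    return render;
}
```

For two portrait videos with natural sizes 1920×1080 and 1280×720, the result is a 1080×1920 render size.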

Exporting the Composition and Saving it to the Camera Roll

The final step in this process involves exporting the entire composition into a single video file and saving that video to the Camera Roll. You use an AVAssetExportSession object to create the new video file and you pass to it your desired URL for the output file. You can then use the ALAssetsLibrary class to save the resulting video file to the Camera Roll.

// Create a static date formatter so we only have to initialize it once.
static NSDateFormatter *kDateFormatter;
if (!kDateFormatter) {
    kDateFormatter = [[NSDateFormatter alloc] init];
    kDateFormatter.dateStyle = NSDateFormatterMediumStyle;
    kDateFormatter.timeStyle = NSDateFormatterShortStyle;
}
// Create the export session with the composition and set the preset to the highest quality.
AVAssetExportSession *exporter = [[AVAssetExportSession alloc] initWithAsset:mutableComposition presetName:AVAssetExportPresetHighestQuality];
// Set the desired output URL for the file created by the export process.
exporter.outputURL = [[[[NSFileManager defaultManager] URLForDirectory:NSDocumentDirectory inDomain:NSUserDomainMask appropriateForURL:nil create:YES error:nil] URLByAppendingPathComponent:[kDateFormatter stringFromDate:[NSDate date]]] URLByAppendingPathExtension:CFBridgingRelease(UTTypeCopyPreferredTagWithClass((CFStringRef)AVFileTypeQuickTimeMovie, kUTTagClassFilenameExtension))];
// Set the output file type to be a QuickTime movie.
exporter.outputFileType = AVFileTypeQuickTimeMovie;
exporter.shouldOptimizeForNetworkUse = YES;
exporter.videoComposition = mutableVideoComposition;
// Asynchronously export the composition to a video file and save this file to the camera roll once export completes.
[exporter exportAsynchronouslyWithCompletionHandler:^{
    dispatch_async(dispatch_get_main_queue(), ^{
        if (exporter.status == AVAssetExportSessionStatusCompleted) {
            ALAssetsLibrary *assetsLibrary = [[ALAssetsLibrary alloc] init];
            if ([assetsLibrary videoAtPathIsCompatibleWithSavedPhotosAlbum:exporter.outputURL]) {
                [assetsLibrary writeVideoAtPathToSavedPhotosAlbum:exporter.outputURL completionBlock:NULL];
            }
        }
    });
}];

Still and Video Media Capture

To manage the capture from a device such as a camera or microphone, you assemble objects to represent inputs and outputs, and use an instance of AVCaptureSession to coordinate the data flow between them. Minimally you need:

  • An instance of AVCaptureDevice to represent the input device, such as a camera or microphone
  • An instance of a concrete subclass of AVCaptureInput to configure the ports from the input device
  • An instance of a concrete subclass of AVCaptureOutput to manage the output to a movie file or still image
  • An instance of AVCaptureSession to coordinate the data flow from the input to the output

To show the user a preview of what the camera is recording, you can use an instance of AVCaptureVideoPreviewLayer (a subclass of CALayer).

You can configure multiple inputs and outputs, coordinated by a single session, as shown in Figure 4-1.

For many applications, this is as much detail as you need. For some operations, however, (if you want to monitor the power levels in an audio channel, for example) you need to consider how the various ports of an input device are represented and how those ports are connected to the output.


A connection between a capture input and a capture output in a capture session is represented by an AVCaptureConnection object. Capture inputs (instances of AVCaptureInput) have one or more input ports (instances of AVCaptureInputPort). Capture outputs (instances of AVCaptureOutput) can accept data from one or more sources (for example, an AVCaptureMovieFileOutput object accepts both video and audio data).

When you add an input or an output to a session, the session forms connections between all the compatible capture inputs’ ports and capture outputs, as shown in Figure 4-2. A connection between a capture input and a capture output is represented by an AVCaptureConnection object.

You can use a capture connection to enable or disable the flow of data from a given input or to a given output. You can also use a connection to monitor the average and peak power levels in an audio channel.


Note: Media capture does not support simultaneous capture of both the front-facing and back-facing cameras on iOS devices.

Use a Capture Session to Coordinate Data Flow

An AVCaptureSession object is the central coordinating object you use to manage data capture. You use an instance to coordinate the flow of data from AV input devices to outputs. You add the capture devices and outputs you want to the session, then start data flow by sending the session a startRunning message, and stop the data flow by sending a stopRunning message.

AVCaptureSession *session = [[AVCaptureSession alloc] init];
// Add inputs and outputs.
[session startRunning];

Configuring a Session

You use a preset on the session to specify the image quality and resolution you want. A preset is a constant that identifies one of a number of possible configurations; in some cases the actual configuration is device-specific:

| Symbol | Resolution | Comments |
| AVCaptureSessionPresetHigh | High | Highest recording quality. This varies per device. |
| AVCaptureSessionPresetMedium | Medium | Suitable for Wi-Fi sharing. The actual values may change. |
| AVCaptureSessionPresetLow | Low | Suitable for 3G sharing. The actual values may change. |
| AVCaptureSessionPreset640x480 | 640×480 | VGA. |
| AVCaptureSessionPreset1280x720 | 1280×720 | 720p HD. |
| AVCaptureSessionPresetPhoto | Photo | Full photo resolution. This is not supported for video output. |

If you want to set a media frame size-specific configuration, you should check whether it is supported before setting it, as follows:


if ([session canSetSessionPreset:AVCaptureSessionPreset1280x720]) {
    session.sessionPreset = AVCaptureSessionPreset1280x720;
}
else {
    // Handle the failure.
}

If you need to adjust session parameters at a more granular level than is possible with a preset, or you’d like to make changes to a running session, you surround your changes with the beginConfiguration and commitConfiguration methods. These methods ensure that device changes occur as a group, minimizing visibility or inconsistency of state. After calling beginConfiguration, you can add or remove outputs, alter the sessionPreset property, or configure individual capture input or output properties. No changes are actually made until you invoke commitConfiguration, at which time they are applied together.

[session beginConfiguration];
// Remove an existing capture device.
// Add a new capture device.
// Reset the preset.
[session commitConfiguration];

Monitoring Capture Session State

A capture session posts notifications that you can observe to be notified, for example, when it starts or stops running, or when it is interrupted. You can register to receive an AVCaptureSessionRuntimeErrorNotification if a runtime error occurs. You can also interrogate the session’s running property to find out if it is running, and its interrupted property to find out if it is interrupted. Additionally, both the running and interrupted properties are key-value observing compliant and the notifications are posted on the main thread.

An AVCaptureDevice Object Represents an Input Device

An AVCaptureDevice object abstracts a physical capture device that provides input data (such as audio or video) to an AVCaptureSession object. There is one object for each input device, for example, two video inputs (one for the front-facing camera, one for the back-facing camera) and one audio input for the microphone.

You can find out which capture devices are currently available using the AVCaptureDevice class methods devices and devicesWithMediaType:. And, if necessary, you can find out what features an iPhone, iPad, or iPod offers (see Device Capture Settings). The list of available devices may change, though. Current input devices may become unavailable (if they’re used by another application), and new input devices may become available (if they’re relinquished by another application). You should register to receive AVCaptureDeviceWasConnectedNotification and AVCaptureDeviceWasDisconnectedNotification notifications to be alerted when the list of available devices changes.

You add an input device to a capture session using a capture input (see Use Capture Inputs to Add a Capture Device to a Session).

Device Characteristics

You can ask a device about its different characteristics. You can also test whether it provides a particular media type or supports a given capture session preset using hasMediaType: and supportsAVCaptureSessionPreset: respectively. To provide information to the user, you can find out the position of the capture device (whether it is on the front or the back of the unit being tested), and its localized name. This may be useful if you want to present a list of capture devices to allow the user to choose one.

Figure 4-3 shows the positions of the back-facing (AVCaptureDevicePositionBack) and front-facing (AVCaptureDevicePositionFront) cameras.

Note: Media capture does not support simultaneous capture of both the front-facing and back-facing cameras on iOS devices.

The following code example iterates over all the available devices and logs their name—and for video devices, their position—on the unit.


NSArray *devices = [AVCaptureDevice devices];

for (AVCaptureDevice *device in devices) {

    NSLog(@"Device name: %@", [device localizedName]);

    if ([device hasMediaType:AVMediaTypeVideo]) {

        if ([device position] == AVCaptureDevicePositionBack) {
            NSLog(@"Device position : back");
        }
        else {
            NSLog(@"Device position : front");
        }
    }
}

In addition, you can find out the device’s model ID and its unique ID.

Device Capture Settings

Different devices have different capabilities; for example, some may support different focus or flash modes; some may support focus on a point of interest.


The following code fragment shows how you can find video input devices that have a torch mode and support a given capture session preset:

NSArray *devices = [AVCaptureDevice devicesWithMediaType:AVMediaTypeVideo];
NSMutableArray *torchDevices = [[NSMutableArray alloc] init];

for (AVCaptureDevice *device in devices) {
    // Keep only the devices that have a torch and support the 640x480 preset.
    if ([device hasTorch] &&
         [device supportsAVCaptureSessionPreset:AVCaptureSessionPreset640x480]) {
        [torchDevices addObject:device];
    }
}

If you find multiple devices that meet your criteria, you might let the user choose which one they want to use. To display a description of a device to the user, you can use its localizedName property.

如果找到多个设备满足标准,你可能会让用户选择一个他们想使用的。给用户显示一个设备的描述,可以使用它的 localizedName 属性。

You use the various different features in similar ways. There are constants to specify a particular mode, and you can ask a device whether it supports a particular mode. In several cases, you can observe a property to be notified when a feature is changing. In all cases, you should lock the device before changing the mode of a particular feature, as described in Configuring a Device.


Note: Focus point of interest and exposure point of interest are mutually exclusive, as are focus mode and exposure mode.


Focus Modes – 聚焦模式

There are three focus modes:

  • AVCaptureFocusModeLocked: The focal position is fixed.
    This is useful when you want to allow the user to compose a scene then lock the focus.
  • AVCaptureFocusModeAutoFocus: The camera does a single scan focus then reverts to locked.
    This is suitable for a situation where you want to select a particular item on which to focus and then maintain focus on that item even if it is not the center of the scene.
  • AVCaptureFocusModeContinuousAutoFocus: The camera continuously autofocuses as needed.

有 3 个聚焦模式:

You use the isFocusModeSupported: method to determine whether a device supports a given focus mode, then set the mode using the focusMode property.

使用 isFocusModeSupported: 方法来决定设备是否支持给定的聚焦模式,然后使用 focusMode 属性设置模式。

In addition, a device may support a focus point of interest. You test for support using focusPointOfInterestSupported. If it’s supported, you set the focal point using focusPointOfInterest. You pass a CGPoint where {0,0} represents the top left of the picture area, and {1,1} represents the bottom right in landscape mode with the home button on the right—this applies even if the device is in portrait mode.

此外,设备可能支持一个兴趣焦点。使用 focusPointOfInterestSupported 进行支持测试。如果支持,使用 focusPointOfInterest 设置焦点。传一个 CGPoint,横向模式下(home 键在右边)图片的左上角是 {0, 0},右下角是 {1, 1},即使设备是纵向模式也适用。

You can use the adjustingFocus property to determine whether a device is currently focusing. You can observe the property using key-value observing to be notified when a device starts and stops focusing.

你可以使用 adjustingFocus 属性来确定设备是否正在聚焦。当设备开始、停止聚焦时可以使用 key-value observing 观察,接收通知。
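
The key-value observing described above could be set up roughly as follows. This is a sketch under assumptions: the FocusObserver class and the context variable are hypothetical names, and the code assumes ARC.

```objc
#import <AVFoundation/AVFoundation.h>

// Unique context pointer so this class only handles its own observations.
static void *FocusObservationContext = &FocusObservationContext;

// Hypothetical class that logs when a device starts and stops focusing.
@interface FocusObserver : NSObject
@property (nonatomic, strong) AVCaptureDevice *device;
- (instancetype)initWithDevice:(AVCaptureDevice *)device;
@end

@implementation FocusObserver

- (instancetype)initWithDevice:(AVCaptureDevice *)device {
    self = [super init];
    if (self) {
        _device = device;
        [_device addObserver:self
                  forKeyPath:@"adjustingFocus"
                     options:NSKeyValueObservingOptionNew
                     context:FocusObservationContext];
    }
    return self;
}

- (void)observeValueForKeyPath:(NSString *)keyPath
                      ofObject:(id)object
                        change:(NSDictionary *)change
                       context:(void *)context {
    if (context == FocusObservationContext) {
        BOOL adjusting = [change[NSKeyValueChangeNewKey] boolValue];
        NSLog(@"Device %@ focusing", adjusting ? @"started" : @"stopped");
    } else {
        [super observeValueForKeyPath:keyPath ofObject:object change:change context:context];
    }
}

- (void)dealloc {
    // Balance the registration made in the initializer.
    [_device removeObserver:self forKeyPath:@"adjustingFocus" context:FocusObservationContext];
}

@end
```

The same pattern should carry over to adjustingExposure and adjustingWhiteBalance by changing the key path.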

If you change the focus mode settings, you can return them to the default configuration as follows:


if ([currentDevice isFocusModeSupported:AVCaptureFocusModeContinuousAutoFocus]) {
    CGPoint autofocusPoint = CGPointMake(0.5f, 0.5f);
    [currentDevice setFocusPointOfInterest:autofocusPoint];
    [currentDevice setFocusMode:AVCaptureFocusModeContinuousAutoFocus];
}

Exposure Modes – 曝光模式

There are two exposure modes:

  • AVCaptureExposureModeContinuousAutoExposure: The device automatically adjusts the exposure level as needed.
  • AVCaptureExposureModeLocked: The exposure level is fixed at its current level.

You use the isExposureModeSupported: method to determine whether a device supports a given exposure mode, then set the mode using the exposureMode property.


使用 isExposureModeSupported: 方法来确定设备是否支持给定的曝光模式,然后使用 exposureMode 属性设置模式。

In addition, a device may support an exposure point of interest. You test for support using exposurePointOfInterestSupported. If it’s supported, you set the exposure point using exposurePointOfInterest. You pass a CGPoint where {0,0} represents the top left of the picture area, and {1,1} represents the bottom right in landscape mode with the home button on the right—this applies even if the device is in portrait mode.

此外,设备可能支持一个曝光兴趣点。使用 exposurePointOfInterestSupported 测试是否支持。如果支持,使用 exposurePointOfInterest 设置曝光点。传一个 CGPoint,横向模式下(home 键在右边)图片的左上角是 {0, 0},右下角是 {1, 1},即使设备是纵向模式也适用。

You can use the adjustingExposure property to determine whether a device is currently changing its exposure setting. You can observe the property using key-value observing to be notified when a device starts and stops changing its exposure setting.

可以使用 adjustingExposure 属性来确定设备当前是否正在改变它的曝光设置。可以使用 key-value observing 观察该属性,在设备开始、停止改变曝光设置时接收通知。

If you change the exposure settings, you can return them to the default configuration as follows:


if ([currentDevice isExposureModeSupported:AVCaptureExposureModeContinuousAutoExposure]) {
    CGPoint exposurePoint = CGPointMake(0.5f, 0.5f);
    [currentDevice setExposurePointOfInterest:exposurePoint];
    [currentDevice setExposureMode:AVCaptureExposureModeContinuousAutoExposure];
}

Flash Modes – 闪光模式

There are three flash modes:

  • AVCaptureFlashModeOff: The flash will never fire.
  • AVCaptureFlashModeOn: The flash will always fire.
  • AVCaptureFlashModeAuto: The flash will fire dependent on the ambient light conditions.

You use hasFlash to determine whether a device has a flash. If that method returns YES, you then use the isFlashModeSupported: method, passing the desired mode to determine whether a device supports a given flash mode, then set the mode using the flashMode property.

有 3 种闪光模式:

使用 hasFlash 来确定设备是否有闪光灯。如果这个方法返回 YES ,然后使用 isFlashModeSupported: 方法确定设备是否支持给定的闪光模式,然后使用 flashMode 属性设置模式。
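
The checks described above might be combined as in the following sketch. The helper name SetFlashMode is hypothetical, and the device lock follows the pattern from Configuring a Device. Note that later iOS SDKs deprecate these device-level flash properties, so treat this as an illustration of the API as this guide describes it.

```objc
#import <AVFoundation/AVFoundation.h>

// Hypothetical helper: applies a flash mode and returns YES on success.
static BOOL SetFlashMode(AVCaptureDevice *device, AVCaptureFlashMode mode) {
    // Both checks are needed: a device can have a flash without supporting every mode.
    if (![device hasFlash] || ![device isFlashModeSupported:mode]) {
        return NO;
    }
    NSError *error = nil;
    if (![device lockForConfiguration:&error]) {
        NSLog(@"Could not lock device for configuration: %@", error);
        return NO;
    }
    device.flashMode = mode;
    [device unlockForConfiguration];
    return YES;
}
```

For example, SetFlashMode(device, AVCaptureFlashModeAuto) would leave the decision to fire the flash to the device.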

Torch Mode – 手电筒模式

In torch mode, the flash is continuously enabled at a low power to illuminate a video capture. There are three torch modes:

  • AVCaptureTorchModeOff: The torch is always off.
  • AVCaptureTorchModeOn: The torch is always on.
  • AVCaptureTorchModeAuto: The torch is automatically switched on and off as needed.

You use hasTorch to determine whether a device has a torch. You use the isTorchModeSupported: method to determine whether a device supports a given torch mode, then set the mode using the torchMode property.

For devices with a torch, the torch only turns on if the device is associated with a running capture session.

在手电筒模式下,闪光灯在一个低功率下一直开启,以照亮对视频捕获。有 3 个手电筒模式:

使用 hasTorch 来确定设备是否有手电筒。使用 isTorchModeSupported: 方法来确定设备是否支持给定的手电筒模式,然后使用 torchMode 属性来设置模式。


Video Stabilization – 视频稳定性

Cinematic video stabilization is available for connections that operate on video, depending on the specific device hardware. Even so, not all source formats and video resolutions are supported.

Enabling cinematic video stabilization may also introduce additional latency into the video capture pipeline. To detect when video stabilization is in use, use the videoStabilizationEnabled property. The enablesVideoStabilizationWhenAvailable property allows an application to automatically enable video stabilization if it is supported by the camera. By default automatic stabilization is disabled due to the above limitations.


使用电影视频稳定化也可能会对视频采集管道引起额外的延迟。正在使用视频稳定化时,使用 videoStabilizationEnabled 属性可以检测。enablesVideoStabilizationWhenAvailable 属性允许应用程序自动使视频稳定化可用,如果它是被摄像头支持的话。由于以上限制,默认自动稳定化是禁用的。
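
A minimal sketch of opting in to stabilization on a movie file output's video connection follows. The helper name is hypothetical; it assumes the output has already been added to a session that has a video input, and it uses the connection properties named in this section (later SDKs replace them with a stabilization-mode API).

```objc
#import <AVFoundation/AVFoundation.h>

// Hypothetical helper: enables stabilization on the output's video connection if possible.
static void EnableStabilizationIfSupported(AVCaptureMovieFileOutput *movieOutput) {
    AVCaptureConnection *connection = [movieOutput connectionWithMediaType:AVMediaTypeVideo];
    if (connection == nil) {
        return; // No video connection yet, so there is nothing to configure.
    }
    if ([connection isVideoStabilizationSupported]) {
        connection.enablesVideoStabilizationWhenAvailable = YES;
    }
    // videoStabilizationEnabled reports whether stabilization is actually in use,
    // which also depends on the active format and resolution.
    NSLog(@"Stabilization active: %d", [connection isVideoStabilizationEnabled]);
}
```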

White Balance – 白平衡

There are two white balance modes:

  • AVCaptureWhiteBalanceModeLocked: The white balance mode is fixed.
  • AVCaptureWhiteBalanceModeContinuousAutoWhiteBalance: The camera continuously adjusts the white balance as needed.

You use the isWhiteBalanceModeSupported: method to determine whether a device supports a given white balance mode, then set the mode using the whiteBalanceMode property.

You can use the adjustingWhiteBalance property to determine whether a device is currently changing its white balance setting. You can observe the property using key-value observing to be notified when a device starts and stops changing its white balance setting.


使用 isWhiteBalanceModeSupported: 方法来确定设备是否支持给定的白平衡模式,然后使用 whiteBalanceMode 属性设置模式。

可以使用 adjustingWhiteBalance 属性来确定设备是否正在改变白平衡设置。当设备开始或者停止改变它的白平衡设置时,可以使用 key-value observing 观察属性,接收通知。

Setting Device Orientation – 设置设备方向

You set the desired orientation on an AVCaptureConnection to specify how you want the images oriented in the AVCaptureOutput (AVCaptureMovieFileOutput, AVCaptureStillImageOutput and AVCaptureVideoDataOutput) for the connection.

Use the AVCaptureConnection supportsVideoOrientation property to determine whether the device supports changing the orientation of the video, and the videoOrientation property to specify how you want the images oriented in the output port. Listing 4-1 shows how to set the orientation for an AVCaptureConnection to AVCaptureVideoOrientationLandscapeLeft:

在 AVCaptureConnection 上设置期望的方向,来指定图像在该连接的 AVCaptureOutput(AVCaptureMovieFileOutput、AVCaptureStillImageOutput 和 AVCaptureVideoDataOutput)中的方向。

使用 AVCaptureConnection 的 supportsVideoOrientation 属性来确定设备是否支持改变视频的方向,使用 videoOrientation 属性指定图像在输出端口的方向。列表 4-1 显示了如何将 AVCaptureConnection 的方向设置为 AVCaptureVideoOrientationLandscapeLeft:

Listing 4-1 Setting the orientation of a capture connection

AVCaptureConnection *captureConnection = <#A capture connection#>;
if ([captureConnection isVideoOrientationSupported])
{
    AVCaptureVideoOrientation orientation = AVCaptureVideoOrientationLandscapeLeft;
    [captureConnection setVideoOrientation:orientation];
}
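
In practice you often want the capture orientation to follow the physical device orientation rather than a fixed value. The function below is one possible mapping; its name is hypothetical. The two landscape cases are swapped because UIDeviceOrientation describes how the device is rotated, whereas AVCaptureVideoOrientation is named after the side the home button is on.

```objc
#import <AVFoundation/AVFoundation.h>
#import <UIKit/UIKit.h>

// Hypothetical helper: maps a device orientation to a capture video orientation.
static AVCaptureVideoOrientation VideoOrientationForDeviceOrientation(UIDeviceOrientation deviceOrientation) {
    switch (deviceOrientation) {
        case UIDeviceOrientationPortraitUpsideDown:
            return AVCaptureVideoOrientationPortraitUpsideDown;
        case UIDeviceOrientationLandscapeLeft:
            // Device rotated left: the home button ends up on the right.
            return AVCaptureVideoOrientationLandscapeRight;
        case UIDeviceOrientationLandscapeRight:
            // Device rotated right: the home button ends up on the left.
            return AVCaptureVideoOrientationLandscapeLeft;
        default:
            // Portrait, face up, face down, and unknown all fall back to portrait here.
            return AVCaptureVideoOrientationPortrait;
    }
}
```

Falling back to portrait for face-up and face-down is a design choice made for this sketch; an application might instead keep the last known orientation.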

Configuring a Device – 配置设备

To set capture properties on a device, you must first acquire a lock on the device using lockForConfiguration:. This avoids making changes that may be incompatible with settings in other applications. The following code fragment illustrates how to approach changing the focus mode on a device by first determining whether the mode is supported, then attempting to lock the device for reconfiguration. The focus mode is changed only if the lock is obtained, and the lock is released immediately afterward.

要在设备上设置捕获属性,必须先使用 lockForConfiguration: 获得设备锁。这样可以避免做出与其他应用程序中的设置不兼容的更改。下面的代码段演示了如何改变设备的聚焦模式:首先确定该模式是否被支持,然后尝试锁定设备以重新配置。只有在获得锁之后才改变聚焦模式,并且之后立即释放锁。

if ([device isFocusModeSupported:AVCaptureFocusModeLocked]) {
    NSError *error = nil;
    if ([device lockForConfiguration:&error]) {
        device.focusMode = AVCaptureFocusModeLocked;
        [device unlockForConfiguration];
    }
    else {
        // Respond to the failure as appropriate.
    }
}

You should hold the device lock only if you need the settable device properties to remain unchanged. Holding the device lock unnecessarily may degrade capture quality in other applications sharing the device.


Switching Between Devices – 切换装置

Sometimes you may want to allow users to switch between input devices, for example, switching from the front-facing to the back-facing camera. To avoid pauses or stuttering, you can reconfigure a session while it is running; however, you should use beginConfiguration and commitConfiguration to bracket your configuration changes:

有时,你可能想允许用户在输入设备之间进行切换,比如从前置摄像头切换到后置摄像头。为了避免暂停或者卡顿,可以在会话运行时重新配置它,但是应该使用 beginConfiguration 和 commitConfiguration 把配置更改包起来:

AVCaptureSession *session = <#A capture session#>;
[session beginConfiguration];

[session removeInput:frontFacingCameraDeviceInput];
[session addInput:backFacingCameraDeviceInput];

[session commitConfiguration];

When the outermost commitConfiguration is invoked, all the changes are made together. This ensures a smooth transition.

当最外面的 commitConfiguration 被调用,所有的改变都是一起做的。这保证了平稳过渡。
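
A slightly more defensive variant of the switch is sketched below. The function name and the fallback behavior are assumptions made for illustration: if the new input cannot be added, the previous input is restored so the session keeps producing data.

```objc
#import <AVFoundation/AVFoundation.h>

// Hypothetical helper: swaps one device input for another inside a single
// configuration transaction. Returns YES if the new input is now in use.
static BOOL SwitchCaptureInput(AVCaptureSession *session,
                               AVCaptureDeviceInput *currentInput,
                               AVCaptureDeviceInput *newInput) {
    BOOL switched = NO;
    [session beginConfiguration];

    // The old input is removed first because a session may reject a second camera input.
    [session removeInput:currentInput];
    if ([session canAddInput:newInput]) {
        [session addInput:newInput];
        switched = YES;
    } else {
        // Restore the previous input rather than leaving the session without video.
        [session addInput:currentInput];
    }

    [session commitConfiguration];
    return switched;
}
```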

Use Capture Inputs to Add a Capture Device to a Session – 使用捕获输入将捕获设备添加到会话中

To add a capture device to a capture session, you use an instance of AVCaptureDeviceInput (a concrete subclass of the abstract AVCaptureInput class). The capture device input manages the device’s ports.

添加一个捕获装置到捕获会话中,使用 AVCaptureDeviceInput (AVCaptureInput 抽象类的具体子类) 的实例。捕获设备输入管理设备的端口。

NSError *error;
AVCaptureDeviceInput *input =
        [AVCaptureDeviceInput deviceInputWithDevice:device error:&error];
if (!input) {
    // Handle the error appropriately.
}

You add inputs to a session using addInput:. If appropriate, you can check whether a capture input is compatible with an existing session using canAddInput:.

使用 addInput: 给会话添加一个输入。如果合适的话,可以使用 canAddInput: 检查是否有输入捕获与现有会话是兼容的。

AVCaptureSession *captureSession = <#Get a capture session#>;
AVCaptureDeviceInput *captureDeviceInput = <#Get a capture device input#>;
if ([captureSession canAddInput:captureDeviceInput]) {
    [captureSession addInput:captureDeviceInput];
}
else {
    // Handle the failure.
}

See Configuring a Session for more details on how you might reconfigure a running session.

An AVCaptureInput vends one or more streams of media data. For example, input devices can provide both audio and video data. Each media stream provided by an input is represented by an AVCaptureInputPort object. A capture session uses an AVCaptureConnection object to define the mapping between a set of AVCaptureInputPort objects and a single AVCaptureOutput.

有关如果配置一个正在运行的会话,更多细节请查看 Configuring a Session .

AVCaptureInput 声明一个或者多个媒体数据流。例如,输入设备可以提供音频和视频数据。输入提供的每个媒体流都被一个 AVCaptureInputPort 所表示。一个捕获会话使用 AVCaptureConnection 对象来定义一个 一组 AVCaptureInputPort 对象和一个 AVCaptureOutput 之间的映射。
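
To see which media streams an input vends, you can inspect its ports. The following is a small sketch; the function name is hypothetical.

```objc
#import <AVFoundation/AVFoundation.h>

// Hypothetical helper: logs the media type of each port provided by a device input.
static void LogInputPorts(AVCaptureDeviceInput *input) {
    for (AVCaptureInputPort *port in input.ports) {
        NSLog(@"Port media type: %@, enabled: %d", port.mediaType, port.isEnabled);
    }
}
```

A camera input typically reports a video port, while a microphone input reports an audio port; the exact set depends on the device.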

Use Capture Outputs to Get Output from a Session – 使用捕获输出从会话得到输出

To get output from a capture session, you add one or more outputs. An output is an instance of a concrete subclass of AVCaptureOutput. You use:

  • AVCaptureMovieFileOutput to output to a movie file
  • AVCaptureVideoDataOutput if you want to process frames from the video being captured, for example, to create your own custom view layer
  • AVCaptureAudioDataOutput if you want to process the audio data being captured
  • AVCaptureStillImageOutput if you want to capture still images with accompanying metadata

You add outputs to a capture session using addOutput:. You check whether a capture output is compatible with an existing session using canAddOutput:. You can add and remove outputs as required while the session is running.

要从捕获会话得到输出,可以添加一个或多个输出。一个输出是 AVCaptureOutput 的具体子类的实例。下面几种使用:

使用 addOutput: 把输出添加到捕获会话中。使用 canAddOutput: 检查是否一个捕获输出与现有的会话是兼容的。可以在会话正在运行的时候添加和删除所需的输出。

AVCaptureSession *captureSession = <#Get a capture session#>;
AVCaptureMovieFileOutput *movieOutput = <#Create and configure a movie output#>;
if ([captureSession canAddOutput:movieOutput]) {
    [captureSession addOutput:movieOutput];
}
else {
    // Handle the failure.
}

Saving to a Movie File – 保存电影文件

You save movie data to a file using an AVCaptureMovieFileOutput object. (AVCaptureMovieFileOutput is a concrete subclass of AVCaptureFileOutput, which defines much of the basic behavior.) You can configure various aspects of the movie file output, such as the maximum duration of a recording, or its maximum file size. You can also prohibit recording if there is less than a given amount of disk space left.

使用 AVCaptureMovieFileOutput 对象保存电影数据到文件中。(AVCaptureMovieFileOutput 是 AVCaptureFileOutput 的具体子类,后者定义了大量的基本行为。)可以配置电影文件输出的各个方面,如记录的最大时长,或最大文件大小。如果剩余磁盘空间小于给定的数量,也可以禁止记录。

AVCaptureMovieFileOutput *aMovieFileOutput = [[AVCaptureMovieFileOutput alloc] init];
CMTime maxDuration = <#Create a CMTime to represent the maximum duration#>;
aMovieFileOutput.maxRecordedDuration = maxDuration;
aMovieFileOutput.minFreeDiskSpaceLimit = <#An appropriate minimum given the quality of the movie format and the duration#>;

The resolution and bit rate for the output depend on the capture session’s sessionPreset. The video encoding is typically H.264 and audio encoding is typically AAC. The actual values vary by device.

输出的分辨率和比特率取决于捕获会话的 sessionPreset 。视频编码通常是 H.264 ,音频编码通常是 AAC 。实际值因设备而异。
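
The placeholders in the listing above could be filled in as in this sketch. The 60-second limit and the 50 MB threshold are illustrative values only, and the function name is hypothetical.

```objc
#import <AVFoundation/AVFoundation.h>

// Hypothetical factory: creates a movie file output with example recording limits.
static AVCaptureMovieFileOutput *MakeLimitedMovieOutput(void) {
    AVCaptureMovieFileOutput *output = [[AVCaptureMovieFileOutput alloc] init];

    // 60 seconds, expressed with a timescale of 600 units per second.
    output.maxRecordedDuration = CMTimeMakeWithSeconds(60.0, 600);

    // Require at least 50 MB of free disk space (the value is in bytes).
    output.minFreeDiskSpaceLimit = 50LL * 1024 * 1024;

    return output;
}
```

When one of these limits is hit, the recording stops and the delegate receives an error, so the delegate should distinguish that case from a genuine failure.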

Starting a Recording – 开始记录

You start recording a QuickTime movie using startRecordingToOutputFileURL:recordingDelegate:. You need to supply a file-based URL and a delegate. The URL must not identify an existing file, because the movie file output does not overwrite existing resources. You must also have permission to write to the specified location. The delegate must conform to the AVCaptureFileOutputRecordingDelegate protocol, and must implement the captureOutput:didFinishRecordingToOutputFileAtURL:fromConnections:error: method.

使用 startRecordingToOutputFileURL:recordingDelegate: 开始记录一个 QuickTime 电影。需要提供一个基于文件的 URL 和一个 delegate。URL 决不能指向一个已经存在的文件,因为电影文件输出不会覆盖已存在的资源。你还必须有权限写入指定的位置。delegate 必须符合 AVCaptureFileOutputRecordingDelegate 协议,并且必须实现 captureOutput:didFinishRecordingToOutputFileAtURL:fromConnections:error: 方法。

AVCaptureMovieFileOutput *aMovieFileOutput = <#Get a movie file output#>;
NSURL *fileURL = <#A file URL that identifies the output location#>;
[aMovieFileOutput startRecordingToOutputFileURL:fileURL recordingDelegate:<#The delegate#>];

In the implementation of captureOutput:didFinishRecordingToOutputFileAtURL:fromConnections:error:, the delegate might write the resulting movie to the Camera Roll album. It should also check for any errors that might have occurred.

在 captureOutput:didFinishRecordingToOutputFileAtURL:fromConnections:error: 的实现中,代理可以将生成的电影写入到相机胶卷相册中。它也应该检查可能发生的任何错误。

Ensuring That the File Was Written Successfully – 确保文件被成功写入

To determine whether the file was saved successfully, in the implementation of captureOutput:didFinishRecordingToOutputFileAtURL:fromConnections:error: you check not only the error but also the value of the AVErrorRecordingSuccessfullyFinishedKey in the error’s user info dictionary:

为了确定文件是否成功被写入,在 captureOutput:didFinishRecordingToOutputFileAtURL:fromConnections:error: 的实现中,不仅要检查错误,还要在错误的用户信息字典中,检查 AVErrorRecordingSuccessfullyFinishedKey 的值。

- (void)captureOutput:(AVCaptureFileOutput *)captureOutput
    didFinishRecordingToOutputFileAtURL:(NSURL *)outputFileURL
        fromConnections:(NSArray *)connections
            error:(NSError *)error {

    BOOL recordedSuccessfully = YES;
    if ([error code] != noErr) {
        // A problem occurred: Find out if the recording was successful.
        id value = [[error userInfo] objectForKey:AVErrorRecordingSuccessfullyFinishedKey];
        if (value) {
            recordedSuccessfully = [value boolValue];
        }
    }
    // Continue as appropriate...
}

You should check the value of the AVErrorRecordingSuccessfullyFinishedKey key in the user info dictionary of the error, because the file might have been saved successfully, even though you got an error. The error might indicate that one of your recording constraints was reached, for example AVErrorMaximumDurationReached or AVErrorMaximumFileSizeReached. Other reasons the recording might stop are:

  • The disk is full (AVErrorDiskFull)
  • The recording device was disconnected (AVErrorDeviceWasDisconnected)
  • The session was interrupted, for example because a phone call was received (AVErrorSessionWasInterrupted)

应该检查错误的用户信息字典中 AVErrorRecordingSuccessfullyFinishedKey 键的值,因为即使得到了一个错误,文件也可能已经被成功保存了。这种错误可能表明达到了你设置的某个记录约束,例如 AVErrorMaximumDurationReached 或者 AVErrorMaximumFileSizeReached。记录可能停止的其他原因是:
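
Putting the success check together with saving to the Camera Roll might look like the sketch below. The helper name is hypothetical, and it uses the UIKit functions UIVideoAtPathIsCompatibleWithSavedPhotosAlbum and UISaveVideoAtPathToSavedPhotosAlbum as one possible way to save; it also assumes the app is allowed to write to the photo library.

```objc
#import <AVFoundation/AVFoundation.h>
#import <UIKit/UIKit.h>

// Hypothetical helper, intended to be called from
// captureOutput:didFinishRecordingToOutputFileAtURL:fromConnections:error:.
// Returns YES if the movie was handed to the photo library for saving.
static BOOL SaveFinishedRecording(NSURL *outputFileURL, NSError *error) {
    BOOL recordedSuccessfully = YES;
    if (error != nil) {
        // An error does not necessarily mean the file is unusable.
        id value = error.userInfo[AVErrorRecordingSuccessfullyFinishedKey];
        if (value) {
            recordedSuccessfully = [value boolValue];
        }
        NSLog(@"Recording stopped with error: %@", error);
    }
    if (!recordedSuccessfully) {
        return NO;
    }

    NSString *path = outputFileURL.path;
    if (!UIVideoAtPathIsCompatibleWithSavedPhotosAlbum(path)) {
        return NO;
    }
    // No completion target is passed, so the outcome of the save is not reported back.
    UISaveVideoAtPathToSavedPhotosAlbum(path, nil, NULL, NULL);
    return YES;
}
```

A production delegate would normally pass a completion target and selector so that it can report a failed save to the user.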

Adding Metadata to a File – 将元数据添加到文件中

You can set metadata for the movie file at any time, even while recording. This is useful for situations where the information is not available when the recording starts, as may be the case with location information. Metadata for a file output is represented by an array of AVMetadataItem objects; you use an instance of its mutable subclass, AVMutableMetadataItem, to create metadata of your own.

可以在任何时间设置电影文件的元数据,即使在记录的时候。当记录开始时信息还不可用(位置信息就可能是这种情况),这一点就很有用。一个文件输出的元数据由 AVMetadataItem 对象的数组表示;使用其可变子类 AVMutableMetadataItem 的实例,去创建属于你自己的元数据。

AVCaptureMovieFileOutput *aMovieFileOutput = <#Get a movie file output#>;
NSArray *existingMetadataArray = aMovieFileOutput.metadata;
NSMutableArray *newMetadataArray = nil;
if (existingMetadataArray) {
    newMetadataArray = [existingMetadataArray mutableCopy];
}
else {
    newMetadataArray = [[NSMutableArray alloc] init];
}

AVMutableMetadataItem *item = [[AVMutableMetadataItem alloc] init];
item.keySpace = AVMetadataKeySpaceCommon;
item.key = AVMetadataCommonKeyLocation;

CLLocation *location = <#The location to set#>;
item.value = [NSString stringWithFormat:@"%+08.4lf%+09.4lf/",
              location.coordinate.latitude, location.coordinate.longitude];

[newMetadataArray addObject:item];

aMovieFileOutput.metadata = newMetadataArray;

Processing Frames of Video – 处理视频的帧

An AVCaptureVideoDataOutput object uses delegation to vend video frames. You set the delegate using setSampleBufferDelegate:queue:. In addition to setting the delegate, you specify a serial queue on which the delegate methods are invoked. You must use a serial queue to ensure that frames are delivered to the delegate in the proper order. You can use the queue to modify the priority given to delivering and processing the video frames. See SquareCam for a sample implementation.

一个 AVCaptureVideoDataOutput 对象使用委托来声明视频帧。使用 setSampleBufferDelegate:queue: 设置代理。除了设置代理,还要制定一个调用它们代理方法的串行队列。必须使用一个串行队列以确保帧以适当的顺序传递给代理。可以使用队列来修改给定传输的优先级和处理视频帧的优先级。查看 SquareCam 有一个简单的实现。

The frames are presented in the delegate method, captureOutput:didOutputSampleBuffer:fromConnection:, as instances of the CMSampleBufferRef opaque type (see Representations of Media). By default, the buffers are emitted in the camera’s most efficient format. You can use the videoSettings property to specify a custom output format. The video settings property is a dictionary; currently, the only supported key is kCVPixelBufferPixelFormatTypeKey. The recommended pixel formats are returned by the availableVideoCVPixelFormatTypes property, and the availableVideoCodecTypes property returns the supported values. Both Core Graphics and OpenGL work well with the BGRA format:

帧在代理方法 captureOutput:didOutputSampleBuffer:fromConnection: 中以 CMSampleBufferRef 不透明类型的实例呈现(详情见 Representations of Media)。默认情况下,缓冲区以相机最高效的格式输出。可以使用 videoSettings 属性指定自定义输出格式。视频设置属性是一个字典;目前,唯一支持的 key 是 kCVPixelBufferPixelFormatTypeKey。推荐的像素格式由 availableVideoCVPixelFormatTypes 属性返回,availableVideoCodecTypes 属性返回支持的值。Core Graphics 和 OpenGL 都能很好地使用 BGRA 格式:

AVCaptureVideoDataOutput *videoDataOutput = [AVCaptureVideoDataOutput new];
NSDictionary *newSettings =
    @{ (NSString *)kCVPixelBufferPixelFormatTypeKey : @(kCVPixelFormatType_32BGRA) };
videoDataOutput.videoSettings = newSettings;

// Discard late frames if the data output queue is blocked (as we process the still image).
[videoDataOutput setAlwaysDiscardsLateVideoFrames:YES];

// Create a serial dispatch queue used for the sample buffer delegate as well as when a still image is captured.
// A serial dispatch queue must be used to guarantee that video frames will be delivered in order.
// See the header doc for setSampleBufferDelegate:queue: for more information.
dispatch_queue_t videoDataOutputQueue = dispatch_queue_create("VideoDataOutputQueue", DISPATCH_QUEUE_SERIAL);
[videoDataOutput setSampleBufferDelegate:self queue:videoDataOutputQueue];

AVCaptureSession *captureSession = <#The Capture Session#>;

if ( [captureSession canAddOutput:videoDataOutput] )
    [captureSession addOutput:videoDataOutput];

Performance Considerations for Processing Video – 处理视频的性能考虑

You should set the session output to the lowest practical resolution for your application. Setting the output to a higher resolution than necessary wastes processing cycles and needlessly consumes power.


You must ensure that your implementation of captureOutput:didOutputSampleBuffer:fromConnection: is able to process a sample buffer within the amount of time allotted to a frame. If it takes too long and you hold onto the video frames, AV Foundation stops delivering frames, not only to your delegate but also to other outputs such as a preview layer.

必须确保 captureOutput:didOutputSampleBuffer:fromConnection: 的实现,能够处理大量时间内的样品缓冲,分配到一个帧中。如果它需要很久,你要一直抓住视频帧,AV Foundation 会停止给,你的代理,还有其他输出例如 preview layer ,提供帧。

You can use the capture video data output’s minFrameDuration property to be sure you have enough time to process a frame, at the cost of having a lower frame rate than would otherwise be the case. You might also make sure that the alwaysDiscardsLateVideoFrames property is set to YES (the default). This ensures that any late video frames are dropped rather than handed to you for processing. Alternatively, if you are recording and it doesn’t matter if the output frames are a little late and you would prefer to get all of them, you can set the property value to NO. This does not mean that frames will not be dropped (that is, frames may still be dropped), but that they may not be dropped as early, or as efficiently.

可以使用捕获视频数据输出的 minFrameDuration 属性来确保你有足够时间来处理帧 – 在具有较低的帧速率比其他情况下的成本。也可以确保 alwaysDiscardsLateVideoFrames 属性被设为 YES (默认)。这确保任何后期视频的帧都被丢弃,而不是交给你处理。或者,如果你是记录,更想得到它们全部,不介意输出帧稍微晚一点的话,可以设置该属性的值为 NO 。这并不意味着不会丢失帧(即,帧仍有可能丢失),但它们不可能像之前那样减少,或者说是有点效果的。
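
As a rough illustration of the trade-off, the sketch below caps delivery at 15 frames per second, which gives the delegate about 66 ms per frame. The helper name and the 15 fps figure are assumptions. The minFrameDuration property is the one this section names; it is deprecated in later SDKs, where the frame duration is configured on the connection or the capture device instead.

```objc
#import <AVFoundation/AVFoundation.h>

// Hypothetical helper: trades frame rate for per-frame processing time.
static void LimitFrameRate(AVCaptureVideoDataOutput *videoDataOutput) {
    // A minimum frame duration of 1/15 s means at most 15 frames per second.
    videoDataOutput.minFrameDuration = CMTimeMake(1, 15);

    // Drop frames that arrive while the delegate is still busy with an earlier one.
    videoDataOutput.alwaysDiscardsLateVideoFrames = YES;
}
```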

Capturing Still Images – 捕获静止图像

You use an AVCaptureStillImageOutput output if you want to capture still images with accompanying metadata. The resolution of the image depends on the preset for the session, as well as the device.

如果你想捕获带着元数据的静止图像,可以使用 AVCaptureStillImageOutput 输出。图像的分辨率取决于会话的预设,以及设备的设置。

Pixel and Encoding Formats – 像素和编码格式

Different devices support different image formats. You can find out what pixel and codec types are supported by a device using availableImageDataCVPixelFormatTypes and availableImageDataCodecTypes respectively. Each method returns an array of the supported values for the specific device. You set the outputSettings dictionary to specify the image format you want, for example:

不同的设备支持不同的图像格式。使用 availableImageDataCVPixelFormatTypes 可以找到什么样的像素被支持,使用 availableImageDataCodecTypes 可以找到什么样的编解码器类型被支持。每一种方法都返回一个特定设备的支持的值的数组。设置 outputSettings 字典来指定你想要的图像格式,例如:

AVCaptureStillImageOutput *stillImageOutput = [[AVCaptureStillImageOutput alloc] init];
NSDictionary *outputSettings = @{ AVVideoCodecKey : AVVideoCodecJPEG};
[stillImageOutput setOutputSettings:outputSettings];

If you want to capture a JPEG image, you should typically not specify your own compression format. Instead, you should let the still image output do the compression for you, since its compression is hardware-accelerated. If you need a data representation of the image, you can use jpegStillImageNSDataRepresentation: to get an NSData object without recompressing the data, even if you modify the image’s metadata.

如果你想捕获一个 JPEG 图像,通常不应该指定自己的压缩格式。相反,应该让静态图像输出为你做压缩,因为它的压缩是硬件加速的。如果你需要图像的数据表示,可以使用 jpegStillImageNSDataRepresentation: 得到一个 NSData 对象,而无需重新压缩数据,即使你修改了图像的元数据。

Capturing an Image – 捕获图像

When you want to capture an image, you send the output a captureStillImageAsynchronouslyFromConnection:completionHandler: message. The first argument is the connection you want to use for the capture. You need to look for the connection whose input port is collecting video:

当你想捕获图像,给输出发送一个 captureStillImageAsynchronouslyFromConnection:completionHandler: 消息。第一个参数是用于想要捕获使用的连接。你需要寻找输入端口是收集视频的连接。

AVCaptureConnection *videoConnection = nil;
for (AVCaptureConnection *connection in stillImageOutput.connections) {
    for (AVCaptureInputPort *port in [connection inputPorts]) {
        if ([[port mediaType] isEqual:AVMediaTypeVideo] ) {
            videoConnection = connection;
            break;
        }
    }
    if (videoConnection) { break; }
}

The second argument to captureStillImageAsynchronouslyFromConnection:completionHandler: is a block that takes two arguments: a CMSampleBuffer opaque type containing the image data, and an error. The sample buffer itself may contain metadata, such as an EXIF dictionary, as an attachment. You can modify the attachments if you want, but note the optimization for JPEG images discussed in Pixel and Encoding Formats.

captureStillImageAsynchronouslyFromConnection:completionHandler: 的第二个参数是一个 blockblock 有两个参数:一个包含图像数据的 CMSampleBuffer 不透明类型,一个 error。样品缓冲自身可能包含元数据,例如 EXIF 字典作为附件。如果你想的话,可以修改附件,但是注意 JPEG 图像进行像素和编码格式的优化。

[stillImageOutput captureStillImageAsynchronouslyFromConnection:videoConnection completionHandler:
 ^(CMSampleBufferRef imageSampleBuffer, NSError *error) {
     CFDictionaryRef exifAttachments =
         CMGetAttachment(imageSampleBuffer, kCGImagePropertyExifDictionary, NULL);
     if (exifAttachments) {
         // Do something with the attachments.
     }
     // Continue as appropriate.
 }];
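
Inside the completion handler, one common next step is to turn the sample buffer into a UIImage. The function below is a sketch with a hypothetical name; it assumes the output settings request JPEG (AVVideoCodecJPEG), because jpegStillImageNSDataRepresentation: expects a JPEG sample buffer.

```objc
#import <AVFoundation/AVFoundation.h>
#import <UIKit/UIKit.h>

// Hypothetical helper: converts a captured JPEG sample buffer into a UIImage.
static UIImage *ImageFromStillSampleBuffer(CMSampleBufferRef imageSampleBuffer) {
    if (imageSampleBuffer == NULL) {
        return nil; // The capture failed; check the handler's error argument.
    }
    // Returns the JPEG data (with its metadata attachments) without recompressing it.
    NSData *jpegData = [AVCaptureStillImageOutput jpegStillImageNSDataRepresentation:imageSampleBuffer];
    return [UIImage imageWithData:jpegData];
}
```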

Showing the User What’s Being Recorded – 显示用户正在被记录什么

You can provide the user with a preview of what’s being recorded by the camera (using a preview layer) or by the microphone (by monitoring the audio channel).

可以为用户提供一个预览,显示相机(使用 preview layer)或者麦克风(通过监控音频信道)正在记录什么。

Video Preview – 视频预览

You can provide the user with a preview of what’s being recorded using an AVCaptureVideoPreviewLayer object. AVCaptureVideoPreviewLayer is a subclass of CALayer (see Core Animation Programming Guide). You don’t need any outputs to show the preview.

使用 AVCaptureVideoPreviewLayer 对象可以给用户提供一个正在被记录的预览。AVCaptureVideoPreviewLayer 是 CALayer 的子类(详情见 Core Animation Programming Guide)。不需要任何输出去显示预览。

Using the AVCaptureVideoDataOutput class provides the client application with the ability to access the video pixels before they are presented to the user.

使用 AVCaptureVideoDataOutput 类提供的访问视频像素才呈现给用户的客户端应用程序的能力。

Unlike a capture output, a video preview layer maintains a strong reference to the session with which it is associated. This is to ensure that the session is not deallocated while the layer is attempting to display video. This is reflected in the way you initialize a preview layer:

与捕获输出不同的是,视频预览层与它关联的会话有一个强引用。这是为了确保会话还没有被释放,layer 就尝试去显示视频。这反映在,你初始化一个预览层的方式上:

AVCaptureSession *captureSession = <#Get a capture session#>;
CALayer *viewLayer = <#Get a layer from the view in which you want to present the preview#>;

AVCaptureVideoPreviewLayer *captureVideoPreviewLayer = [[AVCaptureVideoPreviewLayer alloc] initWithSession:captureSession];
[viewLayer addSublayer:captureVideoPreviewLayer];

In general, the preview layer behaves like any other CALayer object in the render tree (see Core Animation Programming Guide). You can scale the image and perform transformations, rotations, and so on just as you would any layer. One difference is that you may need to set the layer’s orientation property to specify how it should rotate images coming from the camera. In addition, you can test for device support for video mirroring by querying the supportsVideoMirroring property. You can set the videoMirrored property as required, although when the automaticallyAdjustsVideoMirroring property is set to YES (the default), the mirroring value is automatically set based on the configuration of the session.

在一般情况下,预览层行为就像渲染树中任何其他 CALayer 对象(见 Core Animation Programming Guide)。可以缩放图像和执行转换、旋转等,就像你可以在任何层。一个不同点是,你可能需要设置层的 orientation 属性来指定它应该如何从相机中旋转图像。此外,可以通过查询 supportsVideoMirroring 属性来测试设备对于视频镜像的支持。可以根据需要设置 videoMirrored 属性,虽然当 automaticallyAdjustsVideoMirroring 属性被设置为 YES (默认情况下), mirroring 值是自动的基于会话配置进行设置。

Video Gravity Modes – 视频重力模式

The preview layer supports three gravity modes that you set using videoGravity:

  • AVLayerVideoGravityResizeAspect: This preserves the aspect ratio, leaving black bars where the video does not fill the available screen area.
  • AVLayerVideoGravityResizeAspectFill: This preserves the aspect ratio, but fills the available screen area, cropping the video when necessary.
  • AVLayerVideoGravityResize: This simply stretches the video to fill the available screen area, even if doing so distorts the image.

预览层支持 3 种重力模式,使用 videoGravity 设置:
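
A minimal sketch of choosing a gravity mode when installing the preview layer follows. The helper name is hypothetical, and aspect fill is just one reasonable choice for a full-screen viewfinder.

```objc
#import <AVFoundation/AVFoundation.h>

// Hypothetical helper: sizes the preview layer to its host layer and fills it.
static void InstallPreviewLayer(AVCaptureVideoPreviewLayer *previewLayer, CALayer *hostLayer) {
    previewLayer.frame = hostLayer.bounds;

    // Keep the aspect ratio and crop whatever does not fit.
    previewLayer.videoGravity = AVLayerVideoGravityResizeAspectFill;

    // Clip the cropped video to the host layer's bounds.
    hostLayer.masksToBounds = YES;
    [hostLayer addSublayer:previewLayer];
}
```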

Using “Tap to Focus” with a Preview – 使用 “点击焦点” 预览

You need to take care when implementing tap-to-focus in conjunction with a preview layer. You must account for the preview orientation and gravity of the layer, and for the possibility that the preview may be mirrored. See the sample code project AVCam-iOS: Using AVFoundation to Capture Images and Movies for an implementation of this functionality.

需要注意的是,在实现点击时要注意结合预览层。必须考虑到该层的预览方向和重力,并考虑预览变为镜像显示的可能性。请看示例代码项目:AVCam-iOS: Using AVFoundation to Capture Images and Movies,有关这个功能的实现。
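
One possible shape for tap-to-focus is sketched below. It relies on captureDevicePointOfInterestForPoint:, a conversion method that AVCaptureVideoPreviewLayer offers in later iOS SDKs (iOS 6 and up); on earlier systems you would need to do the conversion by hand as the sample project does. The function name is hypothetical.

```objc
#import <AVFoundation/AVFoundation.h>

// Hypothetical helper: focuses the device at a point given in preview layer coordinates.
static void FocusAtLayerPoint(AVCaptureVideoPreviewLayer *previewLayer,
                              AVCaptureDevice *device,
                              CGPoint pointInLayer) {
    // Convert from layer coordinates to the {0,0}-{1,1} space the device expects.
    CGPoint devicePoint = [previewLayer captureDevicePointOfInterestForPoint:pointInLayer];

    if (![device isFocusPointOfInterestSupported] ||
        ![device isFocusModeSupported:AVCaptureFocusModeAutoFocus]) {
        return;
    }

    NSError *error = nil;
    if ([device lockForConfiguration:&error]) {
        // Set the point first, then trigger a single autofocus scan at that point.
        device.focusPointOfInterest = devicePoint;
        device.focusMode = AVCaptureFocusModeAutoFocus;
        [device unlockForConfiguration];
    } else {
        NSLog(@"Could not lock device for configuration: %@", error);
    }
}
```

You would typically call this from a tap gesture recognizer, passing the tap location converted into the preview layer's coordinate space.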

Showing Audio Levels – 显示音频等级

To monitor the average and peak power levels in an audio channel in a capture connection, you use an AVCaptureAudioChannel object. Audio levels are not key-value observable, so you must poll for updated levels as often as you want to update your user interface (for example, 10 times a second).

要监测捕获连接中音频信道的平均和峰值功率水平,可以使用 AVCaptureAudioChannel 对象。音频等级不是 key-value 可观察的,所以必须按照你想更新用户界面的频率(比如每秒 10 次)轮询最新的等级。

AVCaptureAudioDataOutput *audioDataOutput = <#Get the audio data output#>;
NSArray *connections = audioDataOutput.connections;
if ([connections count] > 0) {
    // There should be only one connection to an AVCaptureAudioDataOutput.
    AVCaptureConnection *connection = [connections objectAtIndex:0];

    NSArray *audioChannels = connection.audioChannels;

    for (AVCaptureAudioChannel *channel in audioChannels) {
        float avg = channel.averagePowerLevel;
        float peak = channel.peakHoldLevel;
        // Update the level meter user interface.
    }
}
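The power levels are reported in decibels, so a level meter usually needs them mapped to a drawable range. A minimal sketch in plain C, assuming a simple linear mapping from a display floor up to 0 dB; `meterFractionForPower` and the floor value are hypothetical choices, not part of AVFoundation.

```c
/* Maps a power level in decibels (0 dB is full scale, negative values are
   quieter) to a meter fraction in the range 0..1. floorDb is the level
   drawn as an empty meter, for example -60. */
float meterFractionForPower(float powerDb, float floorDb)
{
    if (floorDb >= 0.0f) {
        return 0.0f;  /* invalid floor: draw an empty meter */
    }
    if (powerDb <= floorDb) {
        return 0.0f;
    }
    if (powerDb >= 0.0f) {
        return 1.0f;
    }
    return (powerDb - floorDb) / (0.0f - floorDb);
}
```

With a floor of -60 dB, a reading of -30 dB fills half the meter. Calling this from a timer that fires about every 0.1 second matches the polling rate suggested above.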

Putting It All Together: Capturing Video Frames as UIImage Objects – 总而言之:捕获视频帧用作 UIImage 对象

This brief code example illustrates how you can capture video and convert the frames you get to UIImage objects. It shows you how to:

  • Create an AVCaptureSession object to coordinate the flow of data from an AV input device to an output
  • Find the AVCaptureDevice object for the input type you want
  • Create an AVCaptureDeviceInput object for the device
  • Create an AVCaptureVideoDataOutput object to produce video frames
  • Implement a delegate for the AVCaptureVideoDataOutput object to process video frames
  • Implement a function to convert the CMSampleBuffer received by the delegate into a UIImage object

这个简短的代码示例演示了如何捕捉视频和将帧转化为 UIImage 对象,下面说明方法:

Note: To focus on the most relevant code, this example omits several aspects of a complete application, including memory management. To use AV Foundation, you are expected to have enough experience with Cocoa to be able to infer the missing pieces.

注意:关注最相关的代码,这个例子省略了一个完成程序的几部分,包括内存管理。为了使用 AV Foundation,你应该有足够的 Cocoa 经验,有能力推断出丢失的碎片。

Create and Configure a Capture Session – 创建和配置捕获会话

You use an AVCaptureSession object to coordinate the flow of data from an AV input device to an output. Create a session, and configure it to produce medium-resolution video frames.

使用 AVCaptureSession 对象去协调从 AV 输入设备到输出的数据流。创建一个会话,并将其配置产生中等分辨率的视频帧。

AVCaptureSession *session=[[AVCaptureSession alloc] init];
session.sessionPreset = AVCaptureSessionPresetMedium;

Create and Configure the Device and Device Input – 创建和配置设备以及设备输入

Capture devices are represented by AVCaptureDevice objects; the class provides methods to retrieve an object for the input type you want. A device has one or more ports, configured using an AVCaptureInput object. Typically, you use the capture input in its default configuration.

Find a video capture device, then create a device input with the device and add it to the session. If an appropriate device cannot be located, then the deviceInputWithDevice:error: method will return an error by reference.

AVCaptureDevice 对象表示捕获设备;类提供你想要的输入类型对象的方法。一个设备具有一个或者多个端口,使用 AVCaptureInput 对象配置。通常情况下,在它的默认配置中使用捕获输入。

找到一个视频捕获设备,然后创建一个带着设备的设备输入,并将其添加到会话中,如果合适的设备无法定位,然后 deviceInputWithDevice:error: 方法将会通过引用返回一个错误。

AVCaptureDevice *device =
    [AVCaptureDevice defaultDeviceWithMediaType:AVMediaTypeVideo];

NSError *error = nil;
AVCaptureDeviceInput *input =
    [AVCaptureDeviceInput deviceInputWithDevice:device error:&error];
if (!input) {
    // Handle the error appropriately.
}
[session addInput:input];

Create and Configure the Video Data Output – 创建和配置视频数据输出

You use an AVCaptureVideoDataOutput object to process uncompressed frames from the video being captured. You typically configure several aspects of an output. For video, for example, you can specify the pixel format using the videoSettings property and cap the frame rate by setting the minFrameDuration property.

Create and configure an output for video data and add it to the session; cap the frame rate to 15 fps by setting the minFrameDuration property to 1/15 second:

使用 AVCaptureVideoDataOutput 对象去处理视频捕获过程中未被压缩的帧。通常配置输出的几个方面。例如视频,可以使用 videoSettings 属性指定像素格式,通过设置 minFrameDuration 属性覆盖帧速率。

为视频数据创建和配置输出,并将其添加到会话中;通过设置 minFrameDuration 属性为每秒 1/15,将帧速率覆盖为 15 fps

AVCaptureVideoDataOutput *output = [[AVCaptureVideoDataOutput alloc] init];
[session addOutput:output];
output.videoSettings =
    @{ (NSString *)kCVPixelBufferPixelFormatTypeKey : @(kCVPixelFormatType_32BGRA) };
output.minFrameDuration = CMTimeMake(1, 15);
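The relationship between a frame-rate cap and the minimum frame duration is just a reciprocal, which the following plain C sketch spells out. `RationalTime` is a hypothetical stand-in for a value/timescale time such as the one built by CMTimeMake(1, 15), and both function names are invented for illustration.

```c
#include <stdint.h>

/* Hypothetical rational time: value / timescale seconds. */
typedef struct { int64_t value; int32_t timescale; } RationalTime;

/* The shortest allowed frame duration for a maximum frame rate:
   one tick on a timescale equal to the rate. */
RationalTime minFrameDurationForMaxRate(int32_t maxFramesPerSecond)
{
    RationalTime t = { 1, maxFramesPerSecond };
    return t;
}

/* Converts a rational time to seconds; returns 0 for a zero timescale. */
double rationalTimeSeconds(RationalTime t)
{
    return (t.timescale != 0) ? (double)t.value / (double)t.timescale : 0.0;
}
```

A cap of 15 fps therefore corresponds to a duration of 1/15 second, about 0.067 second between frames.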

The data output object uses delegation to vend the video frames. The delegate must adopt the AVCaptureVideoDataOutputSampleBufferDelegate protocol. When you set the data output’s delegate, you must also provide a queue on which callbacks should be invoked.

数据输出对象使用委托来提供视频帧。代理必须遵循 AVCaptureVideoDataOutputSampleBufferDelegate 协议。当你设置数据输出的代理时,还必须提供一个用于调用回调的队列。

dispatch_queue_t queue = dispatch_queue_create("MyQueue", NULL);
[output setSampleBufferDelegate:self queue:queue];

You use the queue to modify the priority given to delivering and processing the video frames.


Implement the Sample Buffer Delegate Method – 实现示例缓冲代理方法

In the delegate class, implement the method (captureOutput:didOutputSampleBuffer:fromConnection:) that is called when a sample buffer is written. The video data output object delivers frames as CMSampleBuffer opaque types, so you need to convert from the CMSampleBuffer opaque type to a UIImage object. The function for this operation is shown in Converting CMSampleBuffer to a UIImage Object.

在代理类,实现方法(captureOutput:didOutputSampleBuffer:fromConnection:),当样本缓冲写入时被调用。视频数据输出对象传递了 CMSampleBuffer 不透明类型的帧,所以你需要从 CMSampleBuffer 不透明类型转化为一个 UIImage 对象。这个操作的功能在 Converting CMSampleBuffer to a UIImage Object 中展示。

- (void)captureOutput:(AVCaptureOutput *)captureOutput
        didOutputSampleBuffer:(CMSampleBufferRef)sampleBuffer
        fromConnection:(AVCaptureConnection *)connection {

            UIImage *image = imageFromSampleBuffer(sampleBuffer);
            // Add your code here that uses the image.
}

Remember that the delegate method is invoked on the queue you specified in setSampleBufferDelegate:queue:; if you want to update the user interface, you must invoke any relevant code on the main thread.

记住,代理方法是在 setSampleBufferDelegate:queue: 中你指定的队列中调用;如果你想要更新用户界面,必须在主线程上调用任何相关代码。

Starting and Stopping Recording – 启动和停止录制

After configuring the capture session, you should ensure that the camera has permission to record according to the user’s preferences.


NSString *mediaType = AVMediaTypeVideo;

[AVCaptureDevice requestAccessForMediaType:mediaType completionHandler:^(BOOL granted) {
    if (granted) {
        // Granted access to mediaType.
        [self setDeviceAuthorized:YES];
    } else {
        // Not granted access to mediaType.
        dispatch_async(dispatch_get_main_queue(), ^{
            [[[UIAlertView alloc] initWithTitle:@"AVCam!"
                                        message:@"AVCam doesn't have permission to use Camera, please change privacy settings"
                                       delegate:self
                              cancelButtonTitle:@"OK"
                              otherButtonTitles:nil] show];
            [self setDeviceAuthorized:NO];
        });
    }
}];

If the camera session is configured and the user has approved access to the camera (and if required, the microphone), send a startRunning message to start the recording.

如果相机会话被配置,用户批准访问摄像头(如果需要,麦克风),发送 startRunning 消息开始录制。

Important: The startRunning method is a blocking call which can take some time, therefore you should perform session setup on a serial queue so that the main queue isn’t blocked (which keeps the UI responsive). See AVCam-iOS: Using AVFoundation to Capture Images and Movies for the canonical implementation example.

重点:startRunning 方法是一个阻塞调用,可能需要一些时间,因此你应该在串行队列上执行会话建立,这样主队列不会被堵塞(保持 UI 响应)。典型的实现示例见 AVCam-iOS: Using AVFoundation to Capture Images and Movies。

[session startRunning];

To stop recording, you send the session a stopRunning message.

要停止录制,给会话发送一个 stopRunning 消息。

High Frame Rate Video Capture – 高帧速率视频捕获

iOS 7.0 introduces high frame rate video capture support (also referred to as “SloMo” video) on selected hardware. The full AVFoundation framework supports high frame rate content.

You determine the capture capabilities of a device using the AVCaptureDeviceFormat class. This class has methods that return the supported media types, frame rates, field of view, maximum zoom factor, whether video stabilization is supported, and more.

  • Capture supports full 720p (1280 x 720 pixels) resolution at 60 frames per second (fps), including video stabilization and droppable P-frames (a feature of H264-encoded movies that allows the movies to play back smoothly even on slower and older hardware).
  • Playback has enhanced audio support for slow and fast playback, allowing the time pitch of the audio to be preserved at slower or faster speeds.
  • Editing has full support for scaled edits in mutable compositions.
  • Export provides two options when supporting 60 fps movies. The variable frame rate, slow or fast motion, can be preserved, or the movie can be converted to an arbitrary slower frame rate such as 30 frames per second.

The SloPoke sample code demonstrates the AVFoundation support for fast video capture, determining whether hardware supports high frame rate video capture, playback using various rates and time pitch algorithms, and editing (including setting time scales for portions of a composition).

iOS 7 在特定的硬件中,引入了高帧速率的视频捕获支持(也被称为 “SloMo” 视频)。所有的 AVFoundation 框架都支持高帧速率内容。

使用 AVCaptureDeviceFormat 类确定设备的捕获能力。该类有一个方法,返回支持媒体类型、帧速率、视图因子、最大缩放因子,是否支持视频稳定性等等。

  • 捕获完全支持每秒 60 帧的 720p (1280 x 720 像素)分辨率,包括视频稳定性和可弃用的帧间编码( H264编码特征的电影,使得电影甚至在更慢更老的硬件也能很顺畅的播放)
  • 播放增强了对于慢速和快速播放的音频支持,允许音频的时间间距可以被保存在较慢或者更快的速度。
  • 编辑已全面支持规模可变的组成编辑。
  • 当支持60fps电影,出口提供了两种选择。可变的帧速率,缓慢或者快速的移动,可以保存,或者电影可以被转换为一个任意的较慢的帧速率,比如每秒 30 帧。

SloPoke 示例代码演示了 AVFoundation 支持快速视频捕获,确定硬件是否支持高帧速率视频采集,使用不同速率和时间间距算法播放、编辑(包括设置为一个组件一部分的时间尺度)。

Playback – 播放

You control the playback speed of an AVPlayer instance by passing a value to its setRate: method; the player manages most of the rest automatically. The value is used as a multiplier for the playback speed. A value of 1.0 causes normal playback, 0.5 plays back at half speed, 5.0 plays back at five times normal speed, and so on.

AVPlayer 的实例通过设置 setRate: 方法值,自动管理了大部分的播放速度。值被当做播放速度的乘法器使用。值为 1.0 是正常播放,0.5 是播放速度的一半,5.0 表示播放速度是正常速度的 5 倍,等等。

The AVPlayerItem object supports the audioTimePitchAlgorithm property. This property allows you to specify how audio is played when the movie is played at various frame rates using the Time Pitch Algorithm Settings constants.

AVPlayerItem 对象支持 audioTimePitchAlgorithm 属性。此属性允许你指定在使用时距算法设置常量播放不同的帧速率的电影时,音频的播放方式。

The following table shows the supported time pitch algorithms, the quality, whether the algorithm causes the audio to snap to specific frame rates, and the frame rate range that each algorithm supports.


| Time pitch algorithm | Quality | Snaps to specific frame rate | Rate range |
| AVAudioTimePitchAlgorithmLowQualityZeroLatency | Low quality, suitable for fast-forward, rewind, or low quality voice. | YES | 0.5, 0.666667, 0.8, 1.0, 1.25, 1.5, 2.0 rates. |
| AVAudioTimePitchAlgorithmTimeDomain | Modest quality, less expensive computationally, suitable for voice. | NO | 0.5–2x rates. |
| AVAudioTimePitchAlgorithmSpectral | Highest quality, most expensive computationally, preserves the pitch of the original item. | NO | 1/32–32 rates. |
| AVAudioTimePitchAlgorithmVarispeed | High-quality playback with no pitch correction. | NO | 1/32–32 rates. |
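To make the table concrete, the sketch below checks a requested rate against the listed ranges and, for the zero-latency algorithm, picks the closest listed rate. Choosing the nearest rate is an assumption of this sketch; the table only says that the algorithm snaps to specific rates, not how it selects one. `nearestSnapRate` and `rateInRange` are hypothetical helper names.

```c
#include <stddef.h>

/* Rates listed for AVAudioTimePitchAlgorithmLowQualityZeroLatency. */
static const double kSnapRates[] = { 0.5, 0.666667, 0.8, 1.0, 1.25, 1.5, 2.0 };

/* Returns the listed rate closest to the requested one. Nearest-match is
   an assumption of this sketch, not documented framework behavior. */
double nearestSnapRate(double requested)
{
    double best = kSnapRates[0];
    double bestDistance = requested > best ? requested - best : best - requested;
    for (size_t i = 1; i < sizeof kSnapRates / sizeof kSnapRates[0]; i++) {
        double d = requested > kSnapRates[i] ? requested - kSnapRates[i]
                                             : kSnapRates[i] - requested;
        if (d < bestDistance) {
            bestDistance = d;
            best = kSnapRates[i];
        }
    }
    return best;
}

/* Range check for the other rows: 0.5 to 2 for the time-domain algorithm,
   1/32 to 32 for the spectral and varispeed algorithms. */
int rateInRange(double rate, double low, double high)
{
    return rate >= low && rate <= high;
}
```

For example, a rate of 5.0 lies inside the 1/32 to 32 range but outside the 0.5 to 2 range, so it calls for the spectral or varispeed algorithm.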

Editing – 编辑

When editing, you use the AVMutableComposition class to build temporal edits.

  • Create a new AVMutableComposition instance using the composition class method.
  • Insert your video asset using the insertTimeRange:ofAsset:atTime:error: method.
  • Set the time scale of a portion of the composition using scaleTimeRange:toDuration:

当编辑时,使用 AVMutableComposition 类去建立时间编辑。
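When scaling a portion of a composition, the target duration follows from the desired playback rate: a range played at rate r lasts 1/r times as long. A small plain C sketch of that arithmetic, using a hypothetical `RationalTime` type in place of the framework's time structure and an invented function name:

```c
#include <stdint.h>

/* Hypothetical rational time: value / timescale seconds. */
typedef struct { int64_t value; int32_t timescale; } RationalTime;

/* Duration that a range must be scaled to so that it plays at `rate` times
   normal speed (0.25 is quarter speed). Assumes a non-negative value and
   rounds to the nearest tick; a non-positive rate leaves the time as is. */
RationalTime scaledDurationForRate(RationalTime original, double rate)
{
    RationalTime scaled = original;
    if (rate > 0.0) {
        scaled.value = (int64_t)((double)original.value / rate + 0.5);
    }
    return scaled;
}
```

A 2-second range (1200/600) slowed to quarter speed becomes 8 seconds (4800/600); that result is the kind of duration you would pass as the second argument of scaleTimeRange:toDuration:.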

Export – 出口

Exporting 60 fps video uses the AVAssetExportSession class to export an asset. The content can be exported using two techniques:

Use the AVAssetExportPresetPassthrough preset to avoid reencoding the movie. It retimes the media with the sections of the media tagged as section 60 fps, section slowed down, or section sped up.

Use a constant frame rate export for maximum playback compatibility. Set the frameDuration property of the video composition to 1/30 second (30 fps). You can also specify the time pitch by setting the export session’s audioTimePitchAlgorithm property.

使用 AVAssetExportSession 类将 60fps 的视频导出到资产。该内容可以使用两种技术导出:

使用 AVAssetExportPresetPassthrough 预设,避免将电影重新编码。它重新定时媒体,将媒体部分标记为 60fps 的部分,缓慢的部分或者加速的部分。

使用恒定的帧速率导出最大播放兼容性。设置视频组件的 frameDuration 属性为 30fps 。也可以通过设置导出会话的 audioTimePitchAlgorithm 属性指定时间间距。

Recording – 录制

You capture high frame rate video using the AVCaptureMovieFileOutput class, which automatically supports high frame rate recording. It will automatically select the correct H264 pitch level and bit rate.

To do custom recording, you must use the AVAssetWriter class, which requires some additional setup.

使用 AVCaptureMovieFileOutput 类捕获高帧速率的视频,该类自动支持高帧率录制。它会自动选择正确的 H264 的高音和比特率。

做定制的录制,必须使用 AVAssetWriter 类,这需要一些额外的设置。

assetWriterInput.expectsMediaDataInRealTime = YES;

This setting ensures that the capture can keep up with the incoming data.


Export – 输出

To read and write audiovisual assets, you must use the export APIs provided by the AVFoundation framework. The AVAssetExportSession class provides an interface for simple exporting needs, such as modifying the file format or trimming the length of an asset (see Trimming and Transcoding a Movie). For more in-depth exporting needs, use the AVAssetReader and AVAssetWriter classes.

必须使用 AVFoundation 框架提供的导出 APIs 去读写音视频资产。AVAssetExportSession 类为简单输出需要,提供了一个接口,例如修改文件格式或者削减资产的长度(见 Trimming and Transcoding a Movie)。为了更深入的导出需求,使用 AVAssetReader 和 AVAssetWriter 类。

Use an AVAssetReader when you want to perform an operation on the contents of an asset. For example, you might read the audio track of an asset to produce a visual representation of the waveform. To produce an asset from media such as sample buffers or still images, use an AVAssetWriter object.

当你想对一项资产的内容进行操作时,使用 AVAssetReader 。例如,可以读取一个资产的音频轨道,以产生波形的可视化表示。为了从媒体(比如样品缓冲或者静态图像)生成资产,使用 AVAssetWriter 对象。
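The waveform use case boils down to reducing many audio samples to a few drawable values. The following plain C sketch assumes the audio has already been decompressed to 16-bit linear PCM and copied into an array; that sample format and the name `waveformPeaks` are assumptions made for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Reduces 16-bit PCM samples to one peak magnitude (0..32768) per bucket,
   for example one bucket per horizontal pixel of the waveform view. */
void waveformPeaks(const int16_t *samples, size_t sampleCount,
                   int32_t *peaks, size_t bucketCount)
{
    if (bucketCount == 0) {
        return;
    }
    for (size_t b = 0; b < bucketCount; b++) {
        /* Split the samples into contiguous, nearly equal buckets. */
        size_t start = b * sampleCount / bucketCount;
        size_t end = (b + 1) * sampleCount / bucketCount;
        int32_t peak = 0;
        for (size_t i = start; i < end; i++) {
            /* Widen before negating so that -32768 does not overflow. */
            int32_t magnitude = samples[i] < 0 ? -(int32_t)samples[i]
                                               : (int32_t)samples[i];
            if (magnitude > peak) {
                peak = magnitude;
            }
        }
        peaks[b] = peak;
    }
}
```

Each peak can then be scaled to the height of the view to draw one vertical bar per bucket.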

Note: The asset reader and writer classes are not intended to be used for real-time processing. In fact, an asset reader cannot even be used for reading from a real-time source like an HTTP live stream. However, if you are using an asset writer with a real-time data source, such as an AVCaptureOutput object, set the expectsMediaDataInRealTime property of your asset writer’s inputs to YES. Setting this property to YES for a non-real-time data source will result in your files not being interleaved properly.

注意:资产 reader 和 writer 类不打算用于实时处理。实际上,一个资产读取器甚至不能用于从一个类似 HTTP 直播流的实时资源中读取。然而,如果你使用带着实时数据源的资产写入器,比如 AVCaptureOutput 对象,设置资产写入器入口的 expectsMediaDataInRealTime 属性为 YES。对非实时数据源将此属性设置为 YES,会导致你的文件不能被正确地交错存储。

Reading an Asset – 读取资产

Each AVAssetReader object can be associated only with a single asset at a time, but this asset may contain multiple tracks. For this reason, you must assign concrete subclasses of the AVAssetReaderOutput class to your asset reader before you begin reading in order to configure how the media data is read. There are three concrete subclasses of the AVAssetReaderOutput base class that you can use for your asset reading needs: AVAssetReaderTrackOutput, AVAssetReaderAudioMixOutput, and AVAssetReaderVideoCompositionOutput.

每个 AVAssetReader 对象同一时间只能与单个资产关联,但这个资产可能包含多个轨道。为此,在开始读取之前,你必须给资产读取器指定 AVAssetReaderOutput 类的具体子类,以配置如何读取媒体数据。AVAssetReaderOutput 基类有 3 个具体子类可以满足你的资产读取需求:AVAssetReaderTrackOutput、AVAssetReaderAudioMixOutput 和 AVAssetReaderVideoCompositionOutput。

Creating the Asset Reader – 创建资产读取器

All you need to initialize an AVAssetReader object is the asset that you want to read.

所有你需要去初始化 AVAssetReader 对象是你想要访问的资产。

NSError *outError;
AVAsset *someAsset = <#AVAsset that you want to read#>;
AVAssetReader *assetReader = [AVAssetReader assetReaderWithAsset:someAsset error:&outError];
BOOL success = (assetReader != nil);

Note: Always check that the asset reader returned to you is non-nil to ensure that the asset reader was initialized successfully. Otherwise, the error parameter (outError in the previous example) will contain the relevant error information.

注意:总是要检查返回给你的资产读取器是否为 non-nil,以确保资产读取器已经成功被初始化。否则,错误参数(之前的例子中 outError)将会包含有关错误的信息。

Setting Up the Asset Reader Outputs – 建立资产读取器出口

After you have created your asset reader, set up at least one output to receive the media data being read. When setting up your outputs, be sure to set the alwaysCopiesSampleData property to NO. In this way, you reap the benefits of performance improvements. In all of the examples within this chapter, this property could and should be set to NO.

在你创建了资产读取器之后,至少设置一个出口以接收正在读取的媒体数据。当建立你的出口,确保设置 alwaysCopiesSampleData 属性为 NO。这样,你就收获了性能改进的好处。这一章的所有例子中,这个属性可以并且应该被设置为 NO

If you want only to read media data from one or more tracks and potentially convert that data to a different format, use the AVAssetReaderTrackOutput class, using a single track output object for each AVAssetTrack object that you want to read from your asset. To decompress an audio track to Linear PCM with an asset reader, you set up your track output as follows:

如果你只想从一个或多个轨道读取媒体数据,潜在的数据转换为不同的格式,使用 AVAssetReaderTrackOutput 类,每个你想从你的资产中读取 AVAssetTrack 对象都使用单轨道出口对象。将音频轨道解压缩为有资产读取器的 Linear PCM ,建立轨道出口如下:

AVAsset *localAsset = assetReader.asset;
// Get the audio track to read.
AVAssetTrack *audioTrack = [[localAsset tracksWithMediaType:AVMediaTypeAudio] objectAtIndex:0];
// Decompression settings for Linear PCM
NSDictionary *decompressionAudioSettings = @{ AVFormatIDKey : [NSNumber numberWithUnsignedInt:kAudioFormatLinearPCM] };
// Create the output with the audio track and decompression settings.
AVAssetReaderOutput *trackOutput = [AVAssetReaderTrackOutput assetReaderTrackOutputWithTrack:audioTrack outputSettings:decompressionAudioSettings];
// Add the output to the reader if possible.
if ([assetReader canAddOutput:trackOutput])
    [assetReader addOutput:trackOutput];

Note: To read the media data from a specific asset track in the format in which it was stored, pass nil to the outputSettings parameter.

注意:要以存储时的格式从特定的资产轨道读取媒体数据,给 outputSettings 参数传 nil。

You use the AVAssetReaderAudioMixOutput and AVAssetReaderVideoCompositionOutput classes to read media data that has been mixed or composited together using an AVAudioMix object or AVVideoComposition object, respectively. Typically, these outputs are used when your asset reader is reading from an AVComposition object.

使用 AVAssetReaderAudioMixOutput 和 AVAssetReaderVideoCompositionOutput 类来读取媒体数据,这些媒体数据是分别使用 AVAudioMix 对象或者 AVVideoComposition 对象混合或者组合在一起。通常情况下,当你的资产读取器正在从 AVComposition 读取时,才使用这些出口。

With a single audio mix output, you can read multiple audio tracks from your asset that have been mixed together using an AVAudioMix object. To specify how the audio tracks are mixed, assign the mix to the AVAssetReaderAudioMixOutput object after initialization. The following code displays how to create an audio mix output with all of the audio tracks from your asset, decompress the audio tracks to Linear PCM, and assign an audio mix object to the output. For details on how to configure an audio mix, see Editing.

一个单一音频混合出口,可以从 已经使用 AVAudioMix 对象混合在一起的资产中读取多个音轨。指定音轨是如何被混合在一起的,将混合后的 AVAssetReaderAudioMixOutput 对象初始化。下面的代码显示了如何从资产中创建一个带着所有音轨的音频混合出口,将音轨解压为 Linear PCM,并指定音频混合对象到出口。有如何配置音频混合的细节,请参见 Editing

AVAudioMix *audioMix = <#An AVAudioMix that specifies how the audio tracks from the AVAsset are mixed#>;
// Assumes that assetReader was initialized with an AVComposition object.
AVComposition *composition = (AVComposition *)assetReader.asset;
// Get the audio tracks to read.
NSArray *audioTracks = [composition tracksWithMediaType:AVMediaTypeAudio];
// Get the decompression settings for Linear PCM.
NSDictionary *decompressionAudioSettings = @{ AVFormatIDKey : [NSNumber numberWithUnsignedInt:kAudioFormatLinearPCM] };
// Create the audio mix output with the audio tracks and decompression settings.
// The variable uses the concrete class because audioMix is declared on AVAssetReaderAudioMixOutput.
AVAssetReaderAudioMixOutput *audioMixOutput = [AVAssetReaderAudioMixOutput assetReaderAudioMixOutputWithAudioTracks:audioTracks audioSettings:decompressionAudioSettings];
// Associate the audio mix used to mix the audio tracks being read with the output.
audioMixOutput.audioMix = audioMix;
// Add the output to the reader if possible.
if ([assetReader canAddOutput:audioMixOutput])
    [assetReader addOutput:audioMixOutput];

Note: Passing nil for the audioSettings parameter tells the asset reader to return samples in a convenient uncompressed format. The same is true for the AVAssetReaderVideoCompositionOutput class.

注意:给 audioSettings 参数传递 nil ,告诉资产读取器返回一个方便的未压缩格式的样本。对于 AVAssetReaderVideoCompositionOutput 类同样是可以的。

The video composition output behaves in much the same way: You can read multiple video tracks from your asset that have been composited together using an AVVideoComposition object. To read the media data from multiple composited video tracks and decompress it to ARGB, set up your output as follows:

视频合成输出行为有许多同样的方式:可以从资产(已经被使用 AVVideoComposition 对象合并在一起)读取多个视频轨道。从多个复合视频轨道读取媒体数据,解压缩为 ARGB ,建立出口如下:

AVVideoComposition *videoComposition = <#An AVVideoComposition that specifies how the video tracks from the AVAsset are composited#>;
// Assumes assetReader was initialized with an AVComposition.
AVComposition *composition = (AVComposition *)assetReader.asset;
// Get the video tracks to read.
NSArray *videoTracks = [composition tracksWithMediaType:AVMediaTypeVideo];
// Decompression settings for ARGB.
NSDictionary *decompressionVideoSettings = @{ (id)kCVPixelBufferPixelFormatTypeKey : [NSNumber numberWithUnsignedInt:kCVPixelFormatType_32ARGB], (id)kCVPixelBufferIOSurfacePropertiesKey : [NSDictionary dictionary] };
// Create the video composition output with the video tracks and decompression settings.
// The variable uses the concrete class because videoComposition is declared on AVAssetReaderVideoCompositionOutput.
AVAssetReaderVideoCompositionOutput *videoCompositionOutput = [AVAssetReaderVideoCompositionOutput assetReaderVideoCompositionOutputWithVideoTracks:videoTracks videoSettings:decompressionVideoSettings];
// Associate the video composition used to composite the video tracks being read with the output.
videoCompositionOutput.videoComposition = videoComposition;
// Add the output to the reader if possible.
if ([assetReader canAddOutput:videoCompositionOutput])
    [assetReader addOutput:videoCompositionOutput];
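Once frames are decompressed to 32-bit ARGB, each pixel occupies four bytes in the order alpha, red, green, blue, and rows may be padded beyond the visible width. A minimal plain C sketch of addressing one pixel, assuming you already hold the buffer's base address and bytes-per-row value; `ARGBPixel` and `argbPixelAt` are hypothetical names.

```c
#include <stddef.h>
#include <stdint.h>

/* One pixel in 32-bit ARGB layout: bytes in memory order A, R, G, B. */
typedef struct { uint8_t alpha, red, green, blue; } ARGBPixel;

/* Reads the pixel at column x, row y. bytesPerRow can be larger than
   4 * width because rows may carry padding, so it must not be derived
   from the width. The caller keeps x and y inside the image. */
ARGBPixel argbPixelAt(const uint8_t *baseAddress, size_t bytesPerRow,
                      size_t x, size_t y)
{
    const uint8_t *p = baseAddress + y * bytesPerRow + x * 4;
    ARGBPixel pixel = { p[0], p[1], p[2], p[3] };
    return pixel;
}
```

In a real reader the base address and row stride would come from the pixel buffer attached to each sample buffer, locked for reading while you access it.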

Reading the Asset’s Media Data – 读取资产媒体数据

To start reading after setting up all of the outputs you need, call the startReading method on your asset reader. Next, retrieve the media data individually from each output using the copyNextSampleBuffer method. To start up an asset reader with a single output and read all of its media samples, do the following:

开始读取后建立所有你需要的出口,在你的资产读取器中调用 startReading 方法。下一步,使用 copyNextSampleBuffer 方法从每个出口分别获取媒体数据。以一个出口启动一个资产读取器,并读取它的所有媒体样本,跟着下面做:

// Start the asset reader up.
[self.assetReader startReading];
BOOL done = NO;
while (!done) {
    // Copy the next sample buffer from the reader output.
    CMSampleBufferRef sampleBuffer = [self.assetReaderOutput copyNextSampleBuffer];
    if (sampleBuffer) {
        // Do something with sampleBuffer here.
        // Release the buffer returned by the copy method when you are done with it.
        CFRelease(sampleBuffer);
        sampleBuffer = NULL;
    } else {
        // Find out why the asset reader output couldn't copy another sample buffer.
        if (self.assetReader.status == AVAssetReaderStatusFailed) {
            NSError *failureError = self.assetReader.error;
            // Handle the error here.
            // Stop the loop; no further samples arrive after a failure.
            done = YES;
        } else {
            // The asset reader output has read all of its samples.
            done = YES;
        }
    }
}

Writing an Asset – 写入资产

You use the AVAssetWriter class to write media data from multiple sources to a single file of a specified file format. You don’t need to associate your asset writer object with a specific asset, but you must use a separate asset writer for each output file that you want to create. Because an asset writer can write media data from multiple sources, you must create an AVAssetWriterInput object for each individual track that you want to write to the output file. Each AVAssetWriterInput object expects to receive data in the form of CMSampleBufferRef objects, but if you want to append CVPixelBufferRef objects to your asset writer input, use the AVAssetWriterInputPixelBufferAdaptor class.

AVAssetWriter 类从多个源将媒体数据写入到指定文件格式的单个文件中。不需要将你的资产写入器与一个特定的资产联系起来,但你必须为你要创建的每个输出文件 使用一个独立的资产写入器。因为一个资产写入器可以从多个来源写入媒体数据,你必须为你想写入输出文件的每个独立的轨道创建一个 AVAssetWriterInput 对象。每个 AVAssetWriterInput 对象预计以 CMSampleBufferRef 对象的形成接收数据,但如果你想给你的资产写入器入口 附加 CVPixelBufferRef 对象,使用 AVAssetWriterInputPixelBufferAdaptor 类。

Creating the Asset Writer – 创建资产写入器

To create an asset writer, specify the URL for the output file and the desired file type. The following code displays how to initialize an asset writer to create a QuickTime movie:

为了创建一个资产写入器,为出口文件指定 URL 和所需的文件类型。下面的代码显示了如何初始化一个资产写入器来创建一个 QuickTime 影片:

NSError *outError;
NSURL *outputURL = <#NSURL object representing the URL where you want to save the video#>;
AVAssetWriter *assetWriter = [AVAssetWriter assetWriterWithURL:outputURL
                                                      fileType:AVFileTypeQuickTimeMovie
                                                         error:&outError];
BOOL success = (assetWriter != nil);

Setting Up the Asset Writer Inputs – 建立资产写入器入口

For your asset writer to be able to write media data, you must set up at least one asset writer input. For example, if your source of media data is already vending media samples as CMSampleBufferRef objects, just use the AVAssetWriterInput class. To set up an asset writer input that compresses audio media data to 128 kbps AAC and connect it to your asset writer, do the following:

为你的资产写入器能够写入媒体数据,必须至少设置一个资产写入器入口。例如,如果你的媒体数据源已经以 CMSampleBufferRef 对象声明了声明了媒体样本,只使用 AVAssetWriterInput 类。建立一个资产写入器入口,将音频媒体数据压缩到 128 kbps AAC 并且将它与你的资产写入器连接,跟着下面做:

// Configure the channel layout as stereo.
AudioChannelLayout stereoChannelLayout = {
    .mChannelLayoutTag = kAudioChannelLayoutTag_Stereo,
    .mChannelBitmap = 0,
    .mNumberChannelDescriptions = 0
};

// Convert the channel layout object to an NSData object.
NSData *channelLayoutAsData = [NSData dataWithBytes:&stereoChannelLayout length:offsetof(AudioChannelLayout, mChannelDescriptions)];

// Get the compression settings for 128 kbps AAC.
NSDictionary *compressionAudioSettings = @{
    AVFormatIDKey         : [NSNumber numberWithUnsignedInt:kAudioFormatMPEG4AAC],
    AVEncoderBitRateKey   : [NSNumber numberWithInteger:128000],
    AVSampleRateKey       : [NSNumber numberWithInteger:44100],
    AVChannelLayoutKey    : channelLayoutAsData,
    AVNumberOfChannelsKey : [NSNumber numberWithUnsignedInteger:2]
};

// Create the asset writer input with the compression settings and specify the media type as audio.
AVAssetWriterInput *assetWriterInput = [AVAssetWriterInput assetWriterInputWithMediaType:AVMediaTypeAudio outputSettings:compressionAudioSettings];
// Add the input to the writer if possible.
if ([assetWriter canAddInput:assetWriterInput])
    [assetWriter addInput:assetWriterInput];
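A bit rate setting such as 128 kbps also gives a quick way to estimate how large the audio track will be. The plain C sketch below does that arithmetic; it treats the bit rate as constant and ignores container overhead, so the result is only an approximation, and `estimatedAudioBytes` is a hypothetical helper.

```c
#include <stdint.h>

/* Approximate audio payload size in bytes for a given bit rate and
   duration, ignoring container overhead and encoder variation. */
uint64_t estimatedAudioBytes(uint32_t bitsPerSecond, double seconds)
{
    if (seconds <= 0.0) {
        return 0;
    }
    return (uint64_t)((double)bitsPerSecond * seconds / 8.0);
}
```

One minute of audio at the 128000 bits per second used above comes to roughly 960000 bytes.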

Note: If you want the media data to be written in the format in which it was stored, pass nil in the outputSettings parameter. Pass nil only if the asset writer was initialized with a fileType of AVFileTypeQuickTimeMovie.

注意:如果你想让媒体数据以它被存储的格式写入,给 outputSettings 参数传 nil。只有资产写入器是用 AVFileTypeQuickTimeMovie 的 fileType 初始化时,才传 nil。

Your asset writer input can optionally include some metadata or specify a different transform for a particular track using the metadata and transform properties respectively. For an asset writer input whose data source is a video track, you can maintain the video’s original transform in the output file by doing the following:

你的资产写入器入口可以选择性的包含一些元数据 或者 分别使用 metadata 和 transform 属性为特定的轨道指定不同的变换。对于一个资产写入器的入口,其数据源是一个视频轨道,可以通过下面示例来在输出文件中维持视频的原始变换:

AVAsset *videoAsset = <#AVAsset with at least one video track#>;
AVAssetTrack *videoAssetTrack = [[videoAsset tracksWithMediaType:AVMediaTypeVideo] objectAtIndex:0];
assetWriterInput.transform = videoAssetTrack.preferredTransform;

Note: Set the metadata and transform properties before you begin writing with your asset writer for them to take effect.

注意:要让 metadata 和 transform 属性生效,请在开始用资产写入器写入之前设置它们。

When writing media data to the output file, sometimes you may want to allocate pixel buffers. To do so, use the AVAssetWriterInputPixelBufferAdaptor class. For greatest efficiency, instead of adding pixel buffers that were allocated using a separate pool, use the pixel buffer pool provided by the pixel buffer adaptor. The following code creates a pixel buffer object working in the RGB domain that will use CGImage objects to create its pixel buffers.

当将媒体数据写入输出文件时,有时你可能要分配像素缓冲区。这样做:使用 AVAssetWriterInputPixelBufferAdaptor 类。为了最大的效率,使用由像素缓冲适配器提供的像素缓冲池,代替添加被分配使用一个单独池的像素缓冲区。下面的代码创建一个像素缓冲区对象,在 RGB 色彩下工作,将使用 CGImage 对象创建它的像素缓冲。

NSDictionary *pixelBufferAttributes = @{
    // The Core Video keys are CFStringRef constants, so cast them for use as dictionary keys.
    (id)kCVPixelBufferCGImageCompatibilityKey : [NSNumber numberWithBool:YES],
    (id)kCVPixelBufferCGBitmapContextCompatibilityKey : [NSNumber numberWithBool:YES],
    (id)kCVPixelBufferPixelFormatTypeKey : [NSNumber numberWithInt:kCVPixelFormatType_32ARGB]
};
AVAssetWriterInputPixelBufferAdaptor *inputPixelBufferAdaptor = [AVAssetWriterInputPixelBufferAdaptor assetWriterInputPixelBufferAdaptorWithAssetWriterInput:self.assetWriterInput sourcePixelBufferAttributes:pixelBufferAttributes];

Note: All AVAssetWriterInputPixelBufferAdaptor objects must be connected to a single asset writer input. That asset writer input must accept media data of type AVMediaTypeVideo.

注:所有的 AVAssetWriterInputPixelBufferAdaptor 对象必须连接到一个单独的资产写入器入口。资产写入器入口必须接受 AVMediaTypeVideo 类型的媒体数据。

Writing Media Data – 写入媒体数据

When you have configured all of the inputs needed for your asset writer, you are ready to begin writing media data. As you did with the asset reader, initiate the writing process with a call to the startWriting method. You then need to start a sample-writing session with a call to the startSessionAtSourceTime: method. All writing done by an asset writer has to occur within one of these sessions and the time range of each session defines the time range of media data included from within the source. For example, if your source is an asset reader that is supplying media data read from an AVAsset object and you don’t want to include media data from the first half of the asset, you would do the following:

当你已经为资产写入器配置所有需要的入口时,这时已经准备好开始写入媒体数据。正如在资产读取器所做的,调用 startWriting 方法发起写入过程。然后你需要启动一个样本 – 调用 startSessionAtSourceTime: 方法的写入会话。资产写入器的所有写入都必须在这些会话中发生,并且每个会话的时间范围 定义 包含在来源内媒体数据的时间范围。例如,如果你的来源是一个资产读取器(它从 AVAsset 对象读取到供应的媒体数据),并且你不想包含来自资产的前半部分的媒体数据,你可以像下面这样做:

CMTime halfAssetDuration = CMTimeMultiplyByFloat64(self.asset.duration, 0.5);
[self.assetWriter startSessionAtSourceTime:halfAssetDuration];
//Implementation continues.
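The effect of the session start time can be sketched with two small plain C helpers: one decides whether a sample's timestamp falls at or after the session start, and one gives its time relative to that start. The sketch assumes that samples stamped before the session start are left out of the presented movie and that timescales are positive; `RationalTime` and both function names are hypothetical.

```c
#include <stdint.h>

/* Hypothetical rational time: value / timescale seconds. */
typedef struct { int64_t value; int32_t timescale; } RationalTime;

/* Returns 1 when the sample time is at or after the session start.
   Cross-multiplying compares the two fractions without rounding;
   both timescales are assumed to be positive. */
int sampleFallsInSession(RationalTime sampleTime, RationalTime sessionStart)
{
    return sampleTime.value * sessionStart.timescale >=
           sessionStart.value * sampleTime.timescale;
}

/* Time of the sample in seconds, measured from the session start. */
double secondsFromSessionStart(RationalTime sampleTime, RationalTime sessionStart)
{
    return (double)sampleTime.value / (double)sampleTime.timescale -
           (double)sessionStart.value / (double)sessionStart.timescale;
}
```

With a session that starts 5 seconds into a 10-second source, a sample stamped at 6 seconds lands 1 second into the output, and a sample stamped at 2 seconds falls outside the session.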

Normally, to end a writing session you must call the endSessionAtSourceTime: method. However, if your writing session goes right up to the end of your file, you can end the writing session simply by calling the finishWriting method. To start up an asset writer with a single input and write all of its media data, do the following:

通常,必须调用 endSessionAtSourceTime: 方法结束写入会话。然而,如果你的写入会话正确走到了你的文件末尾,可以简单地通过调用 finishWriting 方法来结束写入会话。要启动一个有单一入口的资产写入器并且写入所有媒体数据。下面示例:

// Prepare the asset writer for writing.
[self.assetWriter startWriting];
// Start a sample-writing session.
[self.assetWriter startSessionAtSourceTime:kCMTimeZero];
// Specify the block to execute when the asset writer is ready for media data and the queue to call it on.
[self.assetWriterInput requestMediaDataWhenReadyOnQueue:myInputSerialQueue usingBlock:^{
    while ([self.assetWriterInput isReadyForMoreMediaData]) {
        // Get the next sample buffer.
        CMSampleBufferRef nextSampleBuffer = [self copyNextSampleBufferToWrite];
        if (nextSampleBuffer) {
            // If it exists, append the next sample buffer to the output file.
            [self.assetWriterInput appendSampleBuffer:nextSampleBuffer];
            // Release the buffer returned by the copy method.
            CFRelease(nextSampleBuffer);
            nextSampleBuffer = NULL;
        } else {
            // Assume that lack of a next sample buffer means the sample buffer source is out of samples and mark the input as finished.
            [self.assetWriterInput markAsFinished];
            break;
        }
    }
}];

The copyNextSampleBufferToWrite method in the code above is simply a stub. The location of this stub is where you would need to insert some logic to return CMSampleBufferRef objects representing the media data that you want to write. One possible source of sample buffers is an asset reader output.

上述代码中的 copyNextSampleBufferToWrite 方法仅仅是一个 stub。你需要在这个 stub 的位置插入一些逻辑,返回表示你想要写入的媒体数据的 CMSampleBufferRef 对象。样本缓冲区的一个可能来源是资产读取器出口。

Reencoding Assets – 重新编码资产

You can use an asset reader and asset writer object in tandem to convert an asset from one representation to another. Using these objects, you have more control over the conversion than you do with an AVAssetExportSession object. For example, you can choose which of the tracks you want to be represented in the output file, specify your own output format, or modify the asset during the conversion process. The first step in this process is just to set up your asset reader outputs and asset writer inputs as desired. After your asset reader and writer are fully configured, you start up both of them with calls to the startReading and startWriting methods, respectively. The following code snippet displays how to use a single asset writer input to write media data supplied by a single asset reader output:

可以串联使用资产读取器和资产写入器对象,把资产从一种表现形式转换为另一种。使用这些对象,你对转换过程的控制比使用 AVAssetExportSession 对象时更多。例如,你可以选择哪些轨道出现在输出文件中,指定自己的输出格式,或者在转换过程中修改资产。这个过程的第一步是按需建立资产读取器出口和资产写入器入口。资产读取器和写入器完全配置好之后,分别调用 startReading 和 startWriting 方法启动它们。下面的代码片段展示了如何使用单一的资产写入器入口,写入由单一的资产读取器出口提供的媒体数据:

NSString *serializationQueueDescription = [NSString stringWithFormat:@"%@ serialization queue", self];

// Create a serialization queue for reading and writing.
dispatch_queue_t serializationQueue = dispatch_queue_create([serializationQueueDescription UTF8String], NULL);

// Specify the block to execute when the asset writer is ready for media data and the queue to call it on.
[self.assetWriterInput requestMediaDataWhenReadyOnQueue:serializationQueue usingBlock:^{
    while ([self.assetWriterInput isReadyForMoreMediaData]) {
        // Get the asset reader output's next sample buffer.
        CMSampleBufferRef sampleBuffer = [self.assetReaderOutput copyNextSampleBuffer];
        if (sampleBuffer != NULL) {
            // If it exists, append this sample buffer to the output file.
            BOOL success = [self.assetWriterInput appendSampleBuffer:sampleBuffer];
            CFRelease(sampleBuffer);
            sampleBuffer = NULL;
            // Check for errors that may have occurred when appending the new sample buffer.
            if (!success && self.assetWriter.status == AVAssetWriterStatusFailed) {
                NSError *failureError = self.assetWriter.error;
                //Handle the error.
            }
        } else {
            // If the next sample buffer doesn't exist, find out why the asset reader output couldn't vend another one.
            if (self.assetReader.status == AVAssetReaderStatusFailed) {
                NSError *failureError = self.assetReader.error;
                //Handle the error here.
            } else {
                // The asset reader output must have vended all of its samples. Mark the input as finished.
                [self.assetWriterInput markAsFinished];
            }
            // Either way there is nothing more to read, so stop looping.
            break;
        }
    }
}];

Putting It All Together: Using an Asset Reader and Writer in Tandem to Reencode an Asset – 总结:使用资产读取器和写入器串联重新编码资产

This brief code example illustrates how to use an asset reader and writer to reencode the first video and audio track of an asset into a new file. It shows how to:

  • Use serialization queues to handle the asynchronous nature of reading and writing audiovisual data
  • Initialize an asset reader and configure two asset reader outputs, one for audio and one for video
  • Initialize an asset writer and configure two asset writer inputs, one for audio and one for video
  • Use an asset reader to asynchronously supply media data to an asset writer through two different output/input combinations
  • Use a dispatch group to be notified of completion of the reencoding process
  • Allow a user to cancel the reencoding process once it has begun

这个简短的代码示例说明如何使用资产读取器和写入器,将一个资产的第一个视频轨道和音频轨道重新编码到一个新文件。它展示了如何:

  • 使用序列化队列来处理读写视听数据的异步性
  • 初始化一个资产读取器,并配置两个资产读取器出口,一个用于音频,一个用于视频
  • 初始化一个资产写入器,并配置两个资产写入器入口,一个用于音频,一个用于视频
  • 使用一个资产读取器,通过两个不同的 输出 / 输入组合来异步向资产写入器提供媒体数据
  • 使用一个调度组接收重新编码过程的完成的通知
  • 一旦开始,允许用户取消重新编码过程

Note: To focus on the most relevant code, this example omits several aspects of a complete application. To use AVFoundation, you are expected to have enough experience with Cocoa to be able to infer the missing pieces.

注:为了聚焦最相关的代码,这个例子省略了完整应用程序的若干方面。要使用 AVFoundation,你应当有足够的 Cocoa 经验,能够自行推断缺少的部分。

Handling the Initial Setup – 处理初始设置

Before you create your asset reader and writer and configure their outputs and inputs, you need to handle some initial setup. The first part of this setup involves creating three separate serialization queues to coordinate the reading and writing process.

在创建资产读取器和写入器和配置它们的出口和入口之前,你需要处理一下初始设置。此设置的第一部分包括创建 3 个独立的序列化队列来协调读写过程。

NSString *serializationQueueDescription = [NSString stringWithFormat:@"%@ serialization queue", self];

// Create the main serialization queue.
self.mainSerializationQueue = dispatch_queue_create([serializationQueueDescription UTF8String], NULL);
NSString *rwAudioSerializationQueueDescription = [NSString stringWithFormat:@"%@ rw audio serialization queue", self];

// Create the serialization queue to use for reading and writing the audio data.
self.rwAudioSerializationQueue = dispatch_queue_create([rwAudioSerializationQueueDescription UTF8String], NULL);
NSString *rwVideoSerializationQueueDescription = [NSString stringWithFormat:@"%@ rw video serialization queue", self];

// Create the serialization queue to use for reading and writing the video data.
self.rwVideoSerializationQueue = dispatch_queue_create([rwVideoSerializationQueueDescription UTF8String], NULL);

The main serialization queue is used to coordinate the starting and stopping of the asset reader and writer (perhaps due to cancellation) and the other two serialization queues are used to serialize the reading and writing by each output/input combination with a potential cancellation.

主序列化队列用于协调资产读取器和写入器的启动和停止(停止可能是由于取消),另外两个序列化队列用于把每个出口 / 入口组合的读写操作与潜在的取消操作串行化。

Now that you have some serialization queues, load the tracks of your asset and begin the reencoding process.


self.asset = <#AVAsset that you want to reencode#>;
self.cancelled = NO;
self.outputURL = <#NSURL representing desired output URL for file generated by asset writer#>;
// Asynchronously load the tracks of the asset you want to read.
[self.asset loadValuesAsynchronouslyForKeys:@[@"tracks"] completionHandler:^{
    // Once the tracks have finished loading, dispatch the work to the main serialization queue.
    dispatch_async(self.mainSerializationQueue, ^{
        // Due to asynchronous nature, check to see if user has already cancelled.
        if (self.cancelled)
            return;
        BOOL success = YES;
        NSError *localError = nil;
        // Check for success of loading the assets tracks.
        success = ([self.asset statusOfValueForKey:@"tracks" error:&localError] == AVKeyValueStatusLoaded);
        if (success) {
            // If the tracks loaded successfully, make sure that no file exists at the output path for the asset writer.
            NSFileManager *fm = [NSFileManager defaultManager];
            NSString *localOutputPath = [self.outputURL path];
            if ([fm fileExistsAtPath:localOutputPath])
                success = [fm removeItemAtPath:localOutputPath error:&localError];
        }
        if (success)
            success = [self setupAssetReaderAndAssetWriter:&localError];
        if (success)
            success = [self startAssetReaderAndWriter:&localError];
        if (!success)
            [self readingAndWritingDidFinishSuccessfully:success withError:localError];
    });
}];

When the track loading process finishes, whether successfully or not, the rest of the work is dispatched to the main serialization queue to ensure that all of this work is serialized with a potential cancellation. Now all that’s left is to implement the cancellation process and the three custom methods at the end of the previous code listing.

当轨道加载过程结束后,无论成功与否,其余的工作都会被分派到主序列化队列,以确保所有这些工作都与潜在的取消操作串行化。现在,剩下的就是实现取消过程,以及前面代码清单末尾的 3 个自定义方法。

Initializing the Asset Reader and Writer – 初始化资产读取器和写入器

The custom setupAssetReaderAndAssetWriter: method initializes the reader and writer and configures two output/input combinations, one for an audio track and one for a video track. In this example, the audio is decompressed to Linear PCM using the asset reader and compressed back to 128 kbps AAC using the asset writer. The video is decompressed to YUV using the asset reader and compressed to H.264 using the asset writer.

自定义的 setupAssetReaderAndAssetWriter: 方法初始化读取器和写入器,并配置两个出口 / 入口组合,一个用于音频轨道,一个用于视频轨道。在这个例子中,资产读取器把音频解压缩为 Linear PCM,资产写入器再把它压缩回 128 kbps 的 AAC。资产读取器把视频解压缩为 YUV,资产写入器再把它压缩为 H.264。

- (BOOL)setupAssetReaderAndAssetWriter:(NSError **)outError
{
    // Create and initialize the asset reader.
    self.assetReader = [[AVAssetReader alloc] initWithAsset:self.asset error:outError];
    BOOL success = (self.assetReader != nil);
    if (success) {
        // If the asset reader was successfully initialized, do the same for the asset writer.
        self.assetWriter = [[AVAssetWriter alloc] initWithURL:self.outputURL fileType:AVFileTypeQuickTimeMovie error:outError];
        success = (self.assetWriter != nil);
    }

    if (success) {
        // If the reader and writer were successfully initialized, grab the audio and video asset tracks that will be used.
        AVAssetTrack *assetAudioTrack = nil, *assetVideoTrack = nil;
        NSArray *audioTracks = [self.asset tracksWithMediaType:AVMediaTypeAudio];
        if ([audioTracks count] > 0)
            assetAudioTrack = [audioTracks objectAtIndex:0];
        NSArray *videoTracks = [self.asset tracksWithMediaType:AVMediaTypeVideo];
        if ([videoTracks count] > 0)
            assetVideoTrack = [videoTracks objectAtIndex:0];

        if (assetAudioTrack) {
            // If there is an audio track to read, set the decompression settings to Linear PCM and create the asset reader output.
            NSDictionary *decompressionAudioSettings = @{ AVFormatIDKey : [NSNumber numberWithUnsignedInt:kAudioFormatLinearPCM] };
            self.assetReaderAudioOutput = [AVAssetReaderTrackOutput assetReaderTrackOutputWithTrack:assetAudioTrack outputSettings:decompressionAudioSettings];
            [self.assetReader addOutput:self.assetReaderAudioOutput];
            // Then, set the compression settings to 128kbps AAC and create the asset writer input.
            AudioChannelLayout stereoChannelLayout = {
                .mChannelLayoutTag = kAudioChannelLayoutTag_Stereo,
                .mChannelBitmap = 0,
                .mNumberChannelDescriptions = 0
            };
            NSData *channelLayoutAsData = [NSData dataWithBytes:&stereoChannelLayout length:offsetof(AudioChannelLayout, mChannelDescriptions)];
            NSDictionary *compressionAudioSettings = @{
                AVFormatIDKey         : [NSNumber numberWithUnsignedInt:kAudioFormatMPEG4AAC],
                AVEncoderBitRateKey   : [NSNumber numberWithInteger:128000],
                AVSampleRateKey       : [NSNumber numberWithInteger:44100],
                AVChannelLayoutKey    : channelLayoutAsData,
                AVNumberOfChannelsKey : [NSNumber numberWithUnsignedInteger:2]
            };
            self.assetWriterAudioInput = [AVAssetWriterInput assetWriterInputWithMediaType:[assetAudioTrack mediaType] outputSettings:compressionAudioSettings];
            [self.assetWriter addInput:self.assetWriterAudioInput];
        }

        if (assetVideoTrack) {
            // If there is a video track to read, set the decompression settings for YUV and create the asset reader output.
            NSDictionary *decompressionVideoSettings = @{
                (id)kCVPixelBufferPixelFormatTypeKey     : [NSNumber numberWithUnsignedInt:kCVPixelFormatType_422YpCbCr8],
                (id)kCVPixelBufferIOSurfacePropertiesKey : [NSDictionary dictionary]
            };
            self.assetReaderVideoOutput = [AVAssetReaderTrackOutput assetReaderTrackOutputWithTrack:assetVideoTrack outputSettings:decompressionVideoSettings];
            [self.assetReader addOutput:self.assetReaderVideoOutput];
            CMFormatDescriptionRef formatDescription = NULL;
            // Grab the video format descriptions from the video track and grab the first one if it exists.
            NSArray *videoFormatDescriptions = [assetVideoTrack formatDescriptions];
            if ([videoFormatDescriptions count] > 0)
                formatDescription = (__bridge CMFormatDescriptionRef)[videoFormatDescriptions objectAtIndex:0];
            CGSize trackDimensions = { .width = 0.0, .height = 0.0 };
            // If the video track had a format description, grab the track dimensions from there. Otherwise, grab them directly from the track itself.
            if (formatDescription)
                trackDimensions = CMVideoFormatDescriptionGetPresentationDimensions(formatDescription, false, false);
            else
                trackDimensions = [assetVideoTrack naturalSize];
            NSDictionary *compressionSettings = nil;
            // If the video track had a format description, attempt to grab the clean aperture settings and pixel aspect ratio used by the video.
            if (formatDescription) {
                NSDictionary *cleanAperture = nil;
                NSDictionary *pixelAspectRatio = nil;
                CFDictionaryRef cleanApertureFromCMFormatDescription = CMFormatDescriptionGetExtension(formatDescription, kCMFormatDescriptionExtension_CleanAperture);
                if (cleanApertureFromCMFormatDescription) {
                    cleanAperture = @{
                        AVVideoCleanApertureWidthKey            : (id)CFDictionaryGetValue(cleanApertureFromCMFormatDescription, kCMFormatDescriptionKey_CleanApertureWidth),
                        AVVideoCleanApertureHeightKey           : (id)CFDictionaryGetValue(cleanApertureFromCMFormatDescription, kCMFormatDescriptionKey_CleanApertureHeight),
                        AVVideoCleanApertureHorizontalOffsetKey : (id)CFDictionaryGetValue(cleanApertureFromCMFormatDescription, kCMFormatDescriptionKey_CleanApertureHorizontalOffset),
                        AVVideoCleanApertureVerticalOffsetKey   : (id)CFDictionaryGetValue(cleanApertureFromCMFormatDescription, kCMFormatDescriptionKey_CleanApertureVerticalOffset)
                    };
                }
                CFDictionaryRef pixelAspectRatioFromCMFormatDescription = CMFormatDescriptionGetExtension(formatDescription, kCMFormatDescriptionExtension_PixelAspectRatio);
                if (pixelAspectRatioFromCMFormatDescription) {
                    pixelAspectRatio = @{
                        AVVideoPixelAspectRatioHorizontalSpacingKey : (id)CFDictionaryGetValue(pixelAspectRatioFromCMFormatDescription, kCMFormatDescriptionKey_PixelAspectRatioHorizontalSpacing),
                        AVVideoPixelAspectRatioVerticalSpacingKey   : (id)CFDictionaryGetValue(pixelAspectRatioFromCMFormatDescription, kCMFormatDescriptionKey_PixelAspectRatioVerticalSpacing)
                    };
                }
                // Add whichever settings we could grab from the format description to the compression settings dictionary.
                if (cleanAperture || pixelAspectRatio) {
                    NSMutableDictionary *mutableCompressionSettings = [NSMutableDictionary dictionary];
                    if (cleanAperture)
                        [mutableCompressionSettings setObject:cleanAperture forKey:AVVideoCleanApertureKey];
                    if (pixelAspectRatio)
                        [mutableCompressionSettings setObject:pixelAspectRatio forKey:AVVideoPixelAspectRatioKey];
                    compressionSettings = mutableCompressionSettings;
                }
            }
            // Create the video settings dictionary for H.264.
            // A dictionary literal is immutable, so take a mutable copy rather than casting it.
            NSMutableDictionary *videoSettings = [@{
                AVVideoCodecKey  : AVVideoCodecH264,
                AVVideoWidthKey  : [NSNumber numberWithDouble:trackDimensions.width],
                AVVideoHeightKey : [NSNumber numberWithDouble:trackDimensions.height]
            } mutableCopy];
            // Put the compression settings into the video settings dictionary if we were able to grab them.
            if (compressionSettings)
                [videoSettings setObject:compressionSettings forKey:AVVideoCompressionPropertiesKey];
            // Create the asset writer input and add it to the asset writer.
            self.assetWriterVideoInput = [AVAssetWriterInput assetWriterInputWithMediaType:[assetVideoTrack mediaType] outputSettings:videoSettings];
            [self.assetWriter addInput:self.assetWriterVideoInput];
        }
    }
    return success;
}

Reencoding the Asset – 重新编码资产

Provided that the asset reader and writer are successfully initialized and configured, the startAssetReaderAndWriter: method described in Handling the Initial Setup is called. This method is where the actual reading and writing of the asset takes place.

如果资产读取器和写入器已成功初始化和配置,就会调用 Handling the Initial Setup 中提到的 startAssetReaderAndWriter: 方法。资产的实际读写就发生在这个方法里。

- (BOOL)startAssetReaderAndWriter:(NSError **)outError
{
    BOOL success = YES;
    // Attempt to start the asset reader.
    success = [self.assetReader startReading];
    if (!success && outError)
        *outError = [self.assetReader error];
    if (success) {
        // If the reader started successfully, attempt to start the asset writer.
        success = [self.assetWriter startWriting];
        if (!success && outError)
            *outError = [self.assetWriter error];
    }

    if (success) {
        // If the asset reader and writer both started successfully, create the dispatch group where the reencoding will take place and start a sample-writing session.
        self.dispatchGroup = dispatch_group_create();
        [self.assetWriter startSessionAtSourceTime:kCMTimeZero];
        self.audioFinished = NO;
        self.videoFinished = NO;

        if (self.assetWriterAudioInput) {
            // If there is audio to reencode, enter the dispatch group before beginning the work.
            dispatch_group_enter(self.dispatchGroup);
            // Specify the block to execute when the asset writer is ready for audio media data, and specify the queue to call it on.
            [self.assetWriterAudioInput requestMediaDataWhenReadyOnQueue:self.rwAudioSerializationQueue usingBlock:^{
                // Because the block is called asynchronously, check to see whether its task is complete.
                if (self.audioFinished)
                    return;
                BOOL completedOrFailed = NO;
                // If the task isn't complete yet, make sure that the input is actually ready for more media data.
                while ([self.assetWriterAudioInput isReadyForMoreMediaData] && !completedOrFailed) {
                    // Get the next audio sample buffer, and append it to the output file.
                    CMSampleBufferRef sampleBuffer = [self.assetReaderAudioOutput copyNextSampleBuffer];
                    if (sampleBuffer != NULL) {
                        BOOL success = [self.assetWriterAudioInput appendSampleBuffer:sampleBuffer];
                        CFRelease(sampleBuffer);
                        sampleBuffer = NULL;
                        completedOrFailed = !success;
                    } else {
                        completedOrFailed = YES;
                    }
                }
                if (completedOrFailed) {
                    // Mark the input as finished, but only if we haven't already done so, and then leave the dispatch group (since the audio work has finished).
                    BOOL oldFinished = self.audioFinished;
                    self.audioFinished = YES;
                    if (oldFinished == NO) {
                        [self.assetWriterAudioInput markAsFinished];
                    }
                    dispatch_group_leave(self.dispatchGroup);
                }
            }];
        }

        if (self.assetWriterVideoInput) {
            // If we had video to reencode, enter the dispatch group before beginning the work.
            dispatch_group_enter(self.dispatchGroup);
            // Specify the block to execute when the asset writer is ready for video media data, and specify the queue to call it on.
            [self.assetWriterVideoInput requestMediaDataWhenReadyOnQueue:self.rwVideoSerializationQueue usingBlock:^{
                // Because the block is called asynchronously, check to see whether its task is complete.
                if (self.videoFinished)
                    return;
                BOOL completedOrFailed = NO;
                // If the task isn't complete yet, make sure that the input is actually ready for more media data.
                while ([self.assetWriterVideoInput isReadyForMoreMediaData] && !completedOrFailed) {
                    // Get the next video sample buffer, and append it to the output file.
                    CMSampleBufferRef sampleBuffer = [self.assetReaderVideoOutput copyNextSampleBuffer];
                    if (sampleBuffer != NULL) {
                        BOOL success = [self.assetWriterVideoInput appendSampleBuffer:sampleBuffer];
                        CFRelease(sampleBuffer);
                        sampleBuffer = NULL;
                        completedOrFailed = !success;
                    } else {
                        completedOrFailed = YES;
                    }
                }
                if (completedOrFailed) {
                    // Mark the input as finished, but only if we haven't already done so, and then leave the dispatch group (since the video work has finished).
                    BOOL oldFinished = self.videoFinished;
                    self.videoFinished = YES;
                    if (oldFinished == NO) {
                        [self.assetWriterVideoInput markAsFinished];
                    }
                    dispatch_group_leave(self.dispatchGroup);
                }
            }];
        }
        // Set up the notification that the dispatch group will send when the audio and video work have both finished.
        dispatch_group_notify(self.dispatchGroup, self.mainSerializationQueue, ^{
            BOOL finalSuccess = YES;
            NSError *finalError = nil;
            // Check to see if the work has finished due to cancellation.
            if (self.cancelled) {
                // If so, cancel the reader and writer.
                [self.assetReader cancelReading];
                [self.assetWriter cancelWriting];
            } else {
                // If cancellation didn't occur, first make sure that the asset reader didn't fail.
                if ([self.assetReader status] == AVAssetReaderStatusFailed) {
                    finalSuccess = NO;
                    finalError = [self.assetReader error];
                }
                // If the asset reader didn't fail, attempt to stop the asset writer and check for any errors.
                if (finalSuccess) {
                    finalSuccess = [self.assetWriter finishWriting];
                    if (!finalSuccess)
                        finalError = [self.assetWriter error];
                }
            }
            // Call the method to handle completion, and pass in the appropriate parameters to indicate whether reencoding was successful.
            [self readingAndWritingDidFinishSuccessfully:finalSuccess withError:finalError];
        });
    }
    // Return success here to indicate whether the asset reader and writer were started successfully.
    return success;
}

During reencoding, the audio and video tracks are asynchronously handled on individual serialization queues to increase the overall performance of the process, but both queues are contained within the same dispatch group. By placing the work for each track within the same dispatch group, the group can send a notification when all of the work is done and the success of the reencoding process can be determined.
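
The bookkeeping behind that notification can be pictured with a small model. The sketch below is a toy, single-threaded stand-in written in plain C, not libdispatch: `ToyGroup` and every `toy_` function are hypothetical names invented for illustration. It only shows the counting idea, namely that each piece of work enters the group once, leaves once, and the callback runs when the count returns to zero.

```c
#include <assert.h>
#include <stddef.h>

/* Toy, single-threaded model of a dispatch group: a counter plus one callback.
   All names here are hypothetical; real code would use dispatch_group_t. */
typedef struct {
    int outstanding;               /* enters not yet matched by a leave */
    void (*on_empty)(void *ctx);   /* runs when the count returns to zero */
    void *ctx;
} ToyGroup;

static void toy_group_enter(ToyGroup *g) { g->outstanding++; }

static void toy_group_leave(ToyGroup *g) {
    assert(g->outstanding > 0);    /* an unbalanced leave is a programming error */
    if (--g->outstanding == 0 && g->on_empty != NULL) {
        g->on_empty(g->ctx);
    }
}

static void toy_group_notify(ToyGroup *g, void (*fn)(void *), void *ctx) {
    g->on_empty = fn;
    g->ctx = ctx;
    if (g->outstanding == 0) fn(ctx);   /* nothing pending: notify right away */
}

static void count_notification(void *ctx) { *(int *)ctx += 1; }

/* Replays the audio/video scenario and returns how many times the callback
   fired, or -1 if it fired before both pieces of work had finished. */
int toy_reencode_scenario(void) {
    ToyGroup group = {0, NULL, NULL};
    int notified = 0;
    toy_group_enter(&group);                         /* audio work begins */
    toy_group_enter(&group);                         /* video work begins */
    toy_group_notify(&group, count_notification, &notified);
    toy_group_leave(&group);                         /* audio done, video still pending */
    if (notified != 0) return -1;
    toy_group_leave(&group);                         /* video done, callback fires */
    return notified;
}
```

The model also makes the balancing requirement visible: a second leave for the same piece of work would trip the assertion, which is why the listings guard their cleanup with the `audioFinished` and `videoFinished` flags.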


Handling Completion – 处理完成

To handle the completion of the reading and writing process, the readingAndWritingDidFinishSuccessfully: method is called, with parameters indicating whether or not the reencoding completed successfully. If the process didn’t finish successfully, the asset reader and writer are both canceled and any UI-related tasks are dispatched to the main queue.

为了处理读写过程的完成,会调用 readingAndWritingDidFinishSuccessfully: 方法,其参数指出重新编码是否成功完成。如果过程没有成功完成,资产读取器和写入器都会被取消,任何与 UI 相关的任务都会被分派到主队列。

- (void)readingAndWritingDidFinishSuccessfully:(BOOL)success withError:(NSError *)error
{
    if (!success) {
        // If the reencoding process failed, we need to cancel the asset reader and writer.
        [self.assetReader cancelReading];
        [self.assetWriter cancelWriting];
        dispatch_async(dispatch_get_main_queue(), ^{
            // Handle any UI tasks here related to failure.
        });
    } else {
        // Reencoding was successful, reset booleans.
        self.cancelled = NO;
        self.videoFinished = NO;
        self.audioFinished = NO;
        dispatch_async(dispatch_get_main_queue(), ^{
            // Handle any UI tasks here related to success.
        });
    }
}

Handling Cancellation – 处理取消

Using multiple serialization queues, you can allow the user of your app to cancel the reencoding process with ease. On the main serialization queue, messages are asynchronously sent to each of the asset reencoding serialization queues to cancel their reading and writing. When these two serialization queues complete their cancellation, the dispatch group sends a notification to the main serialization queue where the cancelled property is set to YES. You might associate the cancel method from the following code listing with a button on your UI.

使用多个序列化队列,可以让应用程序的用户轻松取消重新编码过程。在主序列化队列上,消息被异步发送到每个资产重新编码序列化队列,以取消它们的读写。当这两个序列化队列完成各自的取消后,调度组向主序列化队列发送一个通知,cancelled 属性在该队列上被设置为 YES。你可以把下面代码清单中的 cancel 方法与 UI 上的按钮关联起来。

- (void)cancel
{
    // Handle cancellation asynchronously, but serialize it with the main queue.
    dispatch_async(self.mainSerializationQueue, ^{
        // If we had audio data to reencode, we need to cancel the audio work.
        if (self.assetWriterAudioInput) {
            // Handle cancellation asynchronously again, but this time serialize it with the audio queue.
            dispatch_async(self.rwAudioSerializationQueue, ^{
                // Update the Boolean property indicating the task is complete and mark the input as finished if it hasn't already been marked as such.
                BOOL oldFinished = self.audioFinished;
                self.audioFinished = YES;
                if (oldFinished == NO) {
                    [self.assetWriterAudioInput markAsFinished];
                    // Leave the dispatch group only when this call is what finished the audio work,
                    // so that every enter is balanced by exactly one leave.
                    dispatch_group_leave(self.dispatchGroup);
                }
            });
        }

        if (self.assetWriterVideoInput) {
            // Handle cancellation asynchronously again, but this time serialize it with the video queue.
            dispatch_async(self.rwVideoSerializationQueue, ^{
                // Update the Boolean property indicating the task is complete and mark the input as finished if it hasn't already been marked as such.
                BOOL oldFinished = self.videoFinished;
                self.videoFinished = YES;
                if (oldFinished == NO) {
                    [self.assetWriterVideoInput markAsFinished];
                    // Leave the dispatch group only when this call is what finished the video work.
                    dispatch_group_leave(self.dispatchGroup);
                }
            });
        }
        // Set the cancelled Boolean property to YES to cancel any work on the main queue as well.
        self.cancelled = YES;
    });
}

Asset Output Settings Assistant – 资产出口设置助手

The AVOutputSettingsAssistant class aids in creating output-settings dictionaries for an asset reader or writer. This makes setup much simpler, especially for high frame rate H.264 movies that have a number of specific presets. Listing 5-1 shows an example that uses the output settings assistant to configure an asset writer and its inputs.

AVOutputSettingsAssistant 类可以帮助为资产读取器或写入器创建输出设置字典。这使得设置简单得多,特别是对于有一些特定预设的高帧速率 H.264 影片。Listing 5-1 展示了使用输出设置助手来配置资产写入器及其入口的例子。

Listing 5-1 AVOutputSettingsAssistant sample

AVOutputSettingsAssistant *outputSettingsAssistant = [AVOutputSettingsAssistant outputSettingsAssistantWithPreset:<some preset>];
CMFormatDescriptionRef audioFormat = [self getAudioFormat];

if (audioFormat != NULL)
    [outputSettingsAssistant setSourceAudioFormatDescription:(CMAudioFormatDescriptionRef)audioFormat];

CMFormatDescriptionRef videoFormat = [self getVideoFormat];

if (videoFormat != NULL)
    [outputSettingsAssistant setSourceVideoFormatDescription:(CMVideoFormatDescriptionRef)videoFormat];

CMTime assetMinVideoFrameDuration = [self getMinFrameDuration];
CMTime averageFrameDuration = [self getAvgFrameDuration];

[outputSettingsAssistant setSourceVideoAverageFrameDuration:averageFrameDuration];
[outputSettingsAssistant setSourceVideoMinFrameDuration:assetMinVideoFrameDuration];

AVAssetWriter *assetWriter = [AVAssetWriter assetWriterWithURL:<some URL> fileType:[outputSettingsAssistant outputFileType] error:NULL];
AVAssetWriterInput *audioInput = [AVAssetWriterInput assetWriterInputWithMediaType:AVMediaTypeAudio outputSettings:[outputSettingsAssistant audioSettings] sourceFormatHint:audioFormat];
AVAssetWriterInput *videoInput = [AVAssetWriterInput assetWriterInputWithMediaType:AVMediaTypeVideo outputSettings:[outputSettingsAssistant videoSettings] sourceFormatHint:videoFormat];

Time and Media Representations – 时间和媒体表现

Time-based audiovisual data, such as a movie file or a video stream, is represented in the AV Foundation framework by AVAsset. Its structure dictates much of how the framework works. Several low-level data structures that AV Foundation uses to represent time and media, such as sample buffers, come from the Core Media framework.

基于时间的视听数据,比如电影文件或视频流,在 AV Foundation 框架中由 AVAsset 表示。它的结构在很大程度上决定了框架的工作方式。AV Foundation 用来表示时间和媒体的一些底层数据结构(比如样本缓冲区)来自 Core Media 框架。

Representation of Assets – 资产的表示

AVAsset is the core class in the AV Foundation framework. It provides a format-independent abstraction of time-based audiovisual data, such as a movie file or a video stream. The primary relationships are shown in Figure 6-1. In many cases, you work with one of its subclasses: You use the composition subclasses when you create new assets (see Editing), and you use AVURLAsset to create a new asset instance from media at a given URL (including assets from the MPMedia framework or the Asset Library framework—see Using Assets).

AVAsset 是 AV Foundation 框架的核心类。它为基于时间的视听数据(比如电影文件或视频流)提供了与格式无关的抽象。主要的关系如图 6-1 所示。在很多情况下,你使用的是它的某个子类:创建新资产时使用组合(composition)子类(见 Editing),并使用 AVURLAsset 从给定 URL 的媒体创建新的资产实例(包括来自 MPMedia 框架或 Asset Library 框架的资产,见 Using Assets)。

An asset contains a collection of tracks that are intended to be presented or processed together, each of a uniform media type, including (but not limited to) audio, video, text, closed captions, and subtitles. The asset object provides information about the whole resource, such as its duration or title, as well as hints for presentation, such as its natural size. Assets may also have metadata, represented by instances of AVMetadataItem.

资产包含一组旨在一起呈现或处理的轨道,每个轨道都是单一的媒体类型,包括(但不限于)音频、视频、文本、隐藏式字幕和字幕。资产对象提供关于整个资源的信息,比如它的持续时间或标题,以及用于呈现的提示,比如它的自然尺寸。资产也可以拥有元数据,由 AVMetadataItem 的实例表示。

A track is represented by an instance of AVAssetTrack, as shown in Figure 6-2. In a typical simple case, one track represents the audio component and another represents the video component; in a complex composition, there may be multiple overlapping tracks of audio and video.

轨道由 AVAssetTrack 的实例表示,如图 6-2 所示。在典型的简单情况下,一个轨道代表音频部分,另一个代表视频部分;在复杂的组合中,可以存在多个相互重叠的音频和视频轨道。

A track has a number of properties, such as its type (video or audio), visual and/or audible characteristics (as appropriate), metadata, and timeline (expressed in terms of its parent asset). A track also has an array of format descriptions. The array contains CMFormatDescription objects (see CMFormatDescriptionRef), each of which describes the format of media samples referenced by the track. A track that contains uniform media (for example, all encoded using the same settings) will provide an array with a count of 1.

轨道有许多属性,比如它的类型(视频或音频)、视觉和 / 或听觉特性(视情况而定)、元数据和时间轴(以其父资产为参照表示)。轨道还有一个格式描述数组。数组包含 CMFormatDescription 对象(见 CMFormatDescriptionRef),其中每一个都描述了轨道所引用的媒体样本的格式。包含统一媒体的轨道(例如,全部使用相同设置编码)将提供计数为 1 的数组。

A track may itself be divided into segments, represented by instances of AVAssetTrackSegment. A segment is a time mapping from the source to the asset track timeline.

轨道自身可以被分成几段,由 AVAssetTrackSegment 的实例表示。一个片段是一个时间映射,从资源到资产轨道时间轴的映射。
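
To make the idea of a time mapping concrete, here is a minimal sketch in plain C. It assumes a segment pairs a source range with a target range on the track timeline and that times inside the segment map linearly between the two. `ToyRange`, `ToyMapping`, and the function name are hypothetical stand-ins that use seconds as doubles; they are not the Core Media types.

```c
#include <assert.h>

/* Hypothetical stand-ins, in seconds, for a segment's source and target ranges. */
typedef struct { double start; double duration; } ToyRange;
typedef struct { ToyRange source; ToyRange target; } ToyMapping;

/* Maps a time on the track timeline back to the matching time in the source
   media, assuming a linear mapping. Returns -1.0 for times outside the segment. */
double toy_source_time_for_track_time(ToyMapping m, double track_time) {
    double offset = track_time - m.target.start;
    if (m.target.duration <= 0.0 || offset < 0.0 || offset > m.target.duration) {
        return -1.0;
    }
    /* Scale the offset by the ratio of the two durations (the playback rate). */
    return m.source.start + offset * (m.source.duration / m.target.duration);
}

/* Example mapping: 10 seconds of source media, starting at 10 s, squeezed into
   the first 5 seconds of the track timeline (double speed). */
ToyMapping toy_example_mapping(void) {
    ToyMapping m = { {10.0, 10.0}, {0.0, 5.0} };
    return m;
}
```

With the example mapping, a track time of 2.5 seconds falls halfway through the segment, so it corresponds to the halfway point of the source range.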

Representations of Time – 时间的表示

Time in AV Foundation is represented by primitive structures from the Core Media framework.

AV Foundation 中的时间是由来自 Core Media framework 的原始结构体表示的。

CMTime Represents a Length of Time – CMTime 表示时间的长度

CMTime is a C structure that represents time as a rational number, with a numerator (an int64_t value), and a denominator (an int32_t timescale). Conceptually, the timescale specifies the fraction of a second each unit in the numerator occupies. Thus if the timescale is 4, each unit represents a quarter of a second; if the timescale is 10, each unit represents a tenth of a second, and so on. You frequently use a timescale of 600, because this is a multiple of several commonly used frame rates: 24 fps for film, 30 fps for NTSC (used for TV in North America and Japan), and 25 fps for PAL (used for TV in Europe). Using a timescale of 600, you can exactly represent any number of frames in these systems.

CMTime 是一个 C 语言的结构体,以有理数表示时间,有一个分子(一个 int64_t 值)和一个分母(一个 int32_t 时间刻度)。从概念上讲,时间刻度指定分子中每个单位占一秒的几分之一。因此如果时间刻度为 4,每个单位代表四分之一秒;如果时间刻度为 10,每个单位代表十分之一秒,依此类推。经常使用 600 作为时间刻度,因为它是几种常用帧速率的公倍数:电影的 24 fps,NTSC(用于北美洲和日本的电视)的 30 fps,以及 PAL(用于欧洲的电视)的 25 fps。使用 600 的时间刻度,可以在这些系统中精确地表示任意数量的帧。
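
The arithmetic behind the choice of 600 can be checked with a few lines of plain C. This is only an illustration of the divisibility argument; `ticks_per_frame` and `frame_time_ticks` are hypothetical helpers, not Core Media functions.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: how many timescale units one frame lasts, or -1 when
   the frame duration is not a whole number of units at this timescale. */
int64_t ticks_per_frame(int32_t timescale, int32_t fps) {
    if (fps <= 0 || timescale % fps != 0) return -1;
    return timescale / fps;
}

/* Start time of frame `frame_index`, in timescale units, for a frame rate
   that the timescale can represent exactly; -1 otherwise. */
int64_t frame_time_ticks(int32_t timescale, int32_t fps, int64_t frame_index) {
    int64_t per_frame = ticks_per_frame(timescale, fps);
    return per_frame < 0 ? -1 : per_frame * frame_index;
}
```

At a timescale of 600, one frame lasts 25 units at 24 fps, 20 units at 30 fps, and 24 units at 25 fps, so every frame boundary is an integer. A timescale of 100, by contrast, cannot hold a 24 fps frame duration exactly.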

In addition to a simple time value, a CMTime structure can represent nonnumeric values: +infinity, -infinity, and indefinite. It can also indicate whether the time has been rounded at some point, and it maintains an epoch number.

除了简单的时间值,CMTime 结构体还可以表示非数字的值:正无穷大、负无穷大和不确定。它还可以指示该时间是否在某个时刻被舍入过,并且维护一个纪元(epoch)编号。
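
One way to picture how a single structure carries all of this is a flags field next to the rational value. The sketch below is a plain-C model: `ToyTime` and the `kToyFlag` constants are hypothetical names, and the field order and bit positions follow my understanding of Core Media's CMTime.h, so treat them as assumptions and check the header before relying on them.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Stand-in for CMTime: a rational value plus flags and an epoch (assumed layout). */
typedef struct {
    int64_t  value;      /* numerator */
    int32_t  timescale;  /* denominator */
    uint32_t flags;      /* validity, rounding, and nonnumeric markers */
    int64_t  epoch;      /* distinguishes unrelated timelines */
} ToyTime;

/* Assumed bit assignments, mirroring the kCMTimeFlags constants. */
enum {
    kToyFlagValid            = 1 << 0,
    kToyFlagHasBeenRounded   = 1 << 1,
    kToyFlagPositiveInfinity = 1 << 2,
    kToyFlagNegativeInfinity = 1 << 3,
    kToyFlagIndefinite       = 1 << 4
};

ToyTime toy_time(int64_t value, int32_t timescale, uint32_t flags) {
    ToyTime t = { value, timescale, flags, 0 };
    return t;
}

/* A time is valid only when the valid bit is set; a zeroed struct is invalid. */
bool toy_time_is_valid(ToyTime t) {
    return (t.flags & kToyFlagValid) != 0;
}

/* Numeric means valid and none of the infinity or indefinite bits set. */
bool toy_time_is_numeric(ToyTime t) {
    uint32_t nonnumeric = kToyFlagPositiveInfinity | kToyFlagNegativeInfinity | kToyFlagIndefinite;
    return toy_time_is_valid(t) && (t.flags & nonnumeric) == 0;
}

/* The rounded bit is only meaningful for numeric times. */
bool toy_time_was_rounded(ToyTime t) {
    return toy_time_is_numeric(t) && (t.flags & kToyFlagHasBeenRounded) != 0;
}
```

This layout is one reason to test times with the provided macros rather than comparing fields by hand: many different bit patterns count as invalid.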

Using CMTime – 使用 CMTime

You create a time using CMTimeMake or one of the related functions such as CMTimeMakeWithSeconds (which allows you to create a time using a float value and specify a preferred timescale). There are several functions for time-based arithmetic and for comparing times, as illustrated in the following example:

使用 CMTimeMake 或相关函数之一(例如 CMTimeMakeWithSeconds,它允许你使用浮点值来创建时间,并指定首选时间刻度)来创建一个时间。有若干函数用于基于时间的算术运算和时间比较,如下面的示例所示:

CMTime time1 = CMTimeMake(200, 2); // 200 half-seconds
CMTime time2 = CMTimeMake(400, 4); // 400 quarter-seconds

// time1 and time2 both represent 100 seconds, but using different timescales.
if (CMTimeCompare(time1, time2) == 0) {
    NSLog(@"time1 and time2 are the same");
}

Float64 float64Seconds = 200.0 / 3;
CMTime time3 = CMTimeMakeWithSeconds(float64Seconds, 3); // 66.66... seconds, stored as third-seconds
time3 = CMTimeMultiply(time3, 3);
// time3 now represents 200 seconds; next subtract time1 (100 seconds).
time3 = CMTimeSubtract(time3, time1);

if (CMTIME_COMPARE_INLINE(time2, ==, time3)) {
    NSLog(@"time2 and time3 are the same");
}

For a list of all the available functions, see CMTime Reference.

有关所有可用的功能列表,请参阅 CMTime Reference
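
To see why two times with different timescales can compare as equal, here is a minimal sketch of rational comparison by cross-multiplication. It is an illustration of the idea only, not Core Media's implementation: `RationalTime` and `rational_time_compare` are hypothetical names, and the sketch ignores the nonnumeric values, flags, and epochs that a real CMTime carries.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for a numeric CMTime: value / timescale seconds. */
typedef struct {
    int64_t value;
    int32_t timescale;
} RationalTime;

/* Returns -1, 0, or 1 as a is less than, equal to, or greater than b.
   Cross-multiplying avoids floating point; this sketch assumes positive
   timescales and products that fit in int64_t. */
int rational_time_compare(RationalTime a, RationalTime b) {
    assert(a.timescale > 0 && b.timescale > 0);
    int64_t lhs = a.value * (int64_t)b.timescale;
    int64_t rhs = b.value * (int64_t)a.timescale;
    return (lhs > rhs) - (lhs < rhs);
}
```

Under these assumptions, 200 half-seconds and 400 quarter-seconds compare as equal because 200 × 4 and 400 × 2 are the same product, which mirrors the time1 and time2 comparison in the example above.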

Special Values of CMTime – CMTime 的特殊值

Core Media provides constants for special values: kCMTimeZero, kCMTimeInvalid, kCMTimePositiveInfinity, and kCMTimeNegativeInfinity. There are many ways in which a CMTime structure can, for example, represent a time that is invalid. To test whether a CMTime is valid, or a nonnumeric value, you should use an appropriate macro, such as CMTIME_IS_INVALID, CMTIME_IS_POSITIVE_INFINITY, or CMTIME_IS_INDEFINITE.

Core Media 提供了特殊值的常量:kCMTimeZero、kCMTimeInvalid、kCMTimePositiveInfinity,以及 kCMTimeNegativeInfinity。例如,CMTime 结构体可以有很多种方式表示一个无效的时间。为了测试 CMTime 是否有效,或者是否是一个非数字值,应该使用适当的宏,比如 CMTIME_IS_INVALID、CMTIME_IS_POSITIVE_INFINITY,或者 CMTIME_IS_INDEFINITE。

CMTime myTime = <#Get a CMTime#>;
if (CMTIME_IS_INVALID(myTime)) {
    // Perhaps treat this as an error; display a suitable alert to the user.
}

You should not compare the value of an arbitrary CMTime structure with kCMTimeInvalid.

你不应该将一个任意的 CMTime 结构体的值与 kCMTimeInvalid 比较。

Representing CMTime as an Object – CMTime表示为一个对象

If you need to use CMTime structures in annotations or Core Foundation containers, you can convert a CMTime structure to and from a CFDictionary opaque type (see CFDictionaryRef) using the CMTimeCopyAsDictionary and CMTimeMakeFromDictionary functions, respectively. You can also get a string representation of a CMTime structure using the CMTimeCopyDescription function.

如果你需要在注释或者 Core Foundation 容器中使用 CMTime 结构体,可以分别使用 CMTimeCopyAsDictionary 和 CMTimeMakeFromDictionary 函数,在 CMTime 结构体与 CFDictionary 不透明类型(见 CFDictionaryRef)之间相互转换。使用 CMTimeCopyDescription 函数可以得到一个 CMTime 结构体的字符串表示。

Epochs – 纪元

The epoch number of a CMTime structure is usually set to 0, but you can use it to distinguish unrelated timelines. For example, the epoch could be incremented through each cycle using a presentation loop, to differentiate between time N in loop 0 and time N in loop 1.

CMTime 结构体的纪元编号通常设置为 0,但是你可以用它来区分不相关的时间轴。例如,在循环播放时,纪元可以在每次循环中递增,以区分循环 0 中的时间 N 与循环 1 中的时间 N。

CMTimeRange Represents a Time Range – CMTimeRange表示一个时间范围

CMTimeRange is a C structure that has a start time and duration, both expressed as CMTime structures. A time range does not include the time that is the start time plus the duration.

You create a time range using CMTimeRangeMake or CMTimeRangeFromTimeToTime. There are constraints on the value of the CMTime epochs:

  • CMTimeRange structures cannot span different epochs.
  • The epoch in a CMTime structure that represents a timestamp may be nonzero, but you can only perform range operations (such as CMTimeRangeGetUnion) on ranges whose start fields have the same epoch.
  • The epoch in a CMTime structure that represents a duration should always be 0, and the value must be nonnegative.

CMTimeRange 是一个 C 语言结构体,有开始时间和持续时间,二者都表示为 CMTime 结构体。时间范围不包括开始时间加上持续时间的那个时间点。

使用 CMTimeRangeMake 或者 CMTimeRangeFromTimeToTime 创建一个时间范围。有关 CMTime 纪元的值,有一些约束:

  • CMTimeRange 结构体不能跨越不同的纪元。
  • CMTime 结构体中的纪元,表示一个时间戳可能是非零,但你只能在其开始字段具有相同纪元的范围内执行范围操作(比如 CMTimeRangeGetUnion)。
  • CMTime 结构体中的纪元,表示持续时间应该总是为 0,并且值必须是非负数。

Working with Time Ranges – 与时间范围工作

Core Media provides functions you can use to determine whether a time range contains a given time or other time range, to determine whether two time ranges are equal, and to calculate unions and intersections of time ranges, such as CMTimeRangeContainsTime, CMTimeRangeEqual, CMTimeRangeContainsTimeRange, and CMTimeRangeGetUnion.

Core Media 提供了一些函数,可用于确定一个时间范围是否包含某个给定的时间或另一个时间范围,确定两个时间范围是否相等,并计算时间范围的并集和交集,比如 CMTimeRangeContainsTime、CMTimeRangeEqual、CMTimeRangeContainsTimeRange,以及 CMTimeRangeGetUnion。

Given that a time range does not include the time that is the start time plus the duration, the following expression always evaluates to false:

由于时间范围不包括开始时间加上持续时间,下面的表达式的结果总是为 false

CMTimeRangeContainsTime(range, CMTimeRangeGetEnd(range))

For a list of all the available functions, see CMTimeRange Reference.

有关所有可用功能的列表,请参阅 CMTimeRange Reference
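
The half-open behavior described above (the end time is not part of the range) can be sketched in plain C. This is a simplified model, not Core Media code: `TickRange`, `tick_range_end`, and `tick_range_contains` are hypothetical names, and the start and duration are plain counts of units at one shared timescale rather than CMTime structures.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical simplified range: start and duration are counts of units
   at one shared timescale (real CMTimeRange fields are CMTime structures). */
typedef struct {
    int64_t start;
    int64_t duration;
} TickRange;

/* End of the range: start plus duration. */
int64_t tick_range_end(TickRange r) {
    assert(r.duration >= 0);
    return r.start + r.duration;
}

/* Half-open test: start <= t < start + duration. */
bool tick_range_contains(TickRange r, int64_t t) {
    return t >= r.start && t < tick_range_end(r);
}
```

One consequence of this convention is that two back-to-back ranges never both contain the boundary time: the time where the first range ends belongs only to the range that starts there.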

Special Values of CMTimeRange – CMTimeRange 的特殊值

Core Media provides constants for a zero-length range and an invalid range, kCMTimeRangeZero and kCMTimeRangeInvalid, respectively. There are many ways, though, in which a CMTimeRange structure can be invalid, zero, or indefinite (if one of the CMTime structures is indefinite). If you need to test whether a CMTimeRange structure is valid, zero, or indefinite, you should use an appropriate macro: CMTIMERANGE_IS_VALID, CMTIMERANGE_IS_INVALID, CMTIMERANGE_IS_EMPTY, or CMTIMERANGE_IS_INDEFINITE.

Core Media 分别为长度为 0 的范围和无效范围提供了常量:kCMTimeRangeZero 和 kCMTimeRangeInvalid。不过,CMTimeRange 结构体可以有很多种方式成为无效的、为零的或不确定的(如果其中一个 CMTime 结构体是不确定的)。如果你需要测试 CMTimeRange 结构体是否有效、为零或不确定,应该使用适当的宏:CMTIMERANGE_IS_VALID、CMTIMERANGE_IS_INVALID、CMTIMERANGE_IS_EMPTY,或者 CMTIMERANGE_IS_INDEFINITE。

CMTimeRange myTimeRange = <#Get a CMTimeRange#>;
if (CMTIMERANGE_IS_EMPTY(myTimeRange)) {
    // The time range is zero.
}

You should not compare the value of an arbitrary CMTimeRange structure with kCMTimeRangeInvalid.

你不应该将任意的 CMTimeRange 结构体的值与 kCMTimeRangeInvalid进行比较。

Representing a CMTimeRange Structure as an Object – 将 CMTimeRange 结构体表示为对象

If you need to use CMTimeRange structures in annotations or Core Foundation containers, you can convert a CMTimeRange structure to and from a CFDictionary opaque type (see CFDictionaryRef) using CMTimeRangeCopyAsDictionary and CMTimeRangeMakeFromDictionary, respectively. You can also get a string representation of a CMTimeRange structure using the CMTimeRangeCopyDescription function.

如果你需要在注释或 Core Foundation 容器中使用 CMTimeRange 结构体,可以分别使用 CMTimeRangeCopyAsDictionary 和 CMTimeRangeMakeFromDictionary,在 CMTimeRange 结构体与 CFDictionary 不透明类型(见 CFDictionaryRef)之间相互转换。也可以使用 CMTimeRangeCopyDescription 函数得到 CMTimeRange 结构体的一个字符串表示。

Representations of Media – 媒体的表示

Video data and its associated metadata are represented in AV Foundation by opaque objects from the Core Media framework. Core Media represents video data using CMSampleBuffer (see CMSampleBufferRef). CMSampleBuffer is a Core Foundation-style opaque type; an instance contains the sample buffer for a frame of video data as a Core Video pixel buffer (see CVPixelBufferRef). You access the pixel buffer from a sample buffer using CMSampleBufferGetImageBuffer:

视频数据及其相关的元数据在 AV Foundation 中由来自 Core Media framework 的不透明对象表示。Core Media 使用 CMSampleBuffer(见 CMSampleBufferRef)表示视频数据。CMSampleBuffer 是 Core Foundation 风格的不透明类型;其实例以 Core Video 像素缓冲区(见 CVPixelBufferRef)的形式包含一帧视频数据的样本缓冲区。使用 CMSampleBufferGetImageBuffer 从样本缓冲区访问像素缓冲区:

CVPixelBufferRef pixelBuffer = CMSampleBufferGetImageBuffer(<#A CMSampleBuffer#>);

From the pixel buffer, you can access the actual video data. For an example, see Converting CMSampleBuffer to a UIImage Object.

In addition to the video data, you can retrieve a number of other aspects of the video frame:

  • Timing information. You get accurate timestamps for both the original presentation time and the decode time using CMSampleBufferGetPresentationTimeStamp and CMSampleBufferGetDecodeTimeStamp respectively.
  • Format information. The format information is encapsulated in a CMFormatDescription object (see CMFormatDescriptionRef). From the format description, you can get, for example, the pixel type and video dimensions using CMVideoFormatDescriptionGetCodecType and CMVideoFormatDescriptionGetDimensions respectively.
  • Metadata. Metadata are stored in a dictionary as an attachment. You use CMGetAttachment to retrieve the dictionary:

从像素缓冲区,可以访问实际的视频数据。有个例子,请参阅 Converting CMSampleBuffer to a UIImage Object


CMSampleBufferRef sampleBuffer = <#Get a sample buffer#>;
CFDictionaryRef metadataDictionary =
    CMGetAttachment(sampleBuffer, CFSTR("MetadataDictionary"), NULL);
if (metadataDictionary) {
    // Do something with the metadata.
}

Converting CMSampleBuffer to a UIImage Object – 将 CMSampleBuffer 转化为 UIImage 对象

The following code shows how you can convert a CMSampleBuffer to a UIImage object. You should consider your requirements carefully before using it. Performing the conversion is a comparatively expensive operation. It is appropriate to, for example, create a still image from a frame of video data taken every second or so. You should not use this as a means to manipulate every frame of video coming from a capture device in real time.

下面的代码展示了如何将一个 CMSampleBuffer 转化为一个 UIImage 对象。在使用它之前,应该仔细考虑你的要求。执行转换是一个相对昂贵的操作。例如,比较合适的是 从每一秒左右的视频数据的一帧创建一个静态图像。你不应该使用这个作为一种手段 去操作来自实时捕获设备的视频的每一帧。

// Create a UIImage from sample buffer data
- (UIImage *) imageFromSampleBuffer:(CMSampleBufferRef) sampleBuffer
{
    // Get a CMSampleBuffer's Core Video image buffer for the media data
    CVImageBufferRef imageBuffer = CMSampleBufferGetImageBuffer(sampleBuffer);
    // Lock the base address of the pixel buffer
    CVPixelBufferLockBaseAddress(imageBuffer, 0);

    // Get the base address of the pixel buffer
    void *baseAddress = CVPixelBufferGetBaseAddress(imageBuffer);

    // Get the number of bytes per row for the pixel buffer
    size_t bytesPerRow = CVPixelBufferGetBytesPerRow(imageBuffer);
    // Get the pixel buffer width and height
    size_t width = CVPixelBufferGetWidth(imageBuffer);
    size_t height = CVPixelBufferGetHeight(imageBuffer);

    // Create a device-dependent RGB color space
    CGColorSpaceRef colorSpace = CGColorSpaceCreateDeviceRGB();

    // Create a bitmap graphics context with the sample buffer data
    CGContextRef context = CGBitmapContextCreate(baseAddress, width, height, 8,
                                                 bytesPerRow, colorSpace, kCGBitmapByteOrder32Little | kCGImageAlphaPremultipliedFirst);
    // Create a Quartz image from the pixel data in the bitmap graphics context
    CGImageRef quartzImage = CGBitmapContextCreateImage(context);
    // Unlock the pixel buffer
    CVPixelBufferUnlockBaseAddress(imageBuffer, 0);

    // Free up the context and color space
    CGContextRelease(context);
    CGColorSpaceRelease(colorSpace);

    // Create an image object from the Quartz image
    UIImage *image = [UIImage imageWithCGImage:quartzImage];

    // Release the Quartz image
    CGImageRelease(quartzImage);

    return (image);
}
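
The function above passes bytesPerRow separately from the width because a pixel buffer may pad each row. As a rough illustration of what that means when reading the raw data, here is a self-contained C sketch that addresses a single pixel. It assumes the buffer holds 32-bit pixels whose bytes are ordered blue, green, red, alpha in memory, which is what the bitmap flags used above imply; `BGRAPixel` and `read_bgra_pixel` are hypothetical names, not Core Video API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical pixel value; channel order in memory is assumed to be
   B, G, R, A (32-bit little-endian word with alpha first). */
typedef struct {
    uint8_t blue, green, red, alpha;
} BGRAPixel;

/* Reads the pixel at (x, y). Rows are bytes_per_row apart, which may be
   larger than width * 4 when the buffer pads each row. */
BGRAPixel read_bgra_pixel(const uint8_t *base, size_t bytes_per_row,
                          size_t width, size_t height, size_t x, size_t y) {
    assert(base != NULL && x < width && y < height);
    assert(bytes_per_row >= width * 4);
    const uint8_t *p = base + y * bytes_per_row + x * 4;
    BGRAPixel pixel = { p[0], p[1], p[2], p[3] };
    return pixel;
}
```

In real code the base pointer would come from CVPixelBufferGetBaseAddress while the buffer is locked, and the row stride from CVPixelBufferGetBytesPerRow. Computing the offset as y × width × 4 instead would read the wrong bytes whenever rows are padded.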

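Postscript: This translation was completed on August 7, 2016, at 16:38. It is based on the 2015-06-30 edition of the official document, which is the latest version at the time of writing. Many passages still need proofreading, and I hope readers who run into problems will report them to me. I will also proofread the text again over the next few days while writing demos. Thanks to my mentor and my team leader for giving me the opportunity to do this work.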

Sharezer , 版权所有丨如未注明 , 均为原创丨本网站采用BY-NC-SA协议进行授权 , 转载请注明AVFoundation Programming Guide(官方文档翻译)完整版中英对照