iOS 3D Transforms, Perspective, and Shading Explained

Background

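I have recently been studying iOS Core Animation. The 3D transforms chapter (Chapter 5) includes a demo that displays a cube on screen and adds lighting and shading to it. The result looks like this:

Figure: rendered result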

Problem

As you can see, the final result looks decent. The code given in the book is as follows:

#import "ViewController.h"
#import <QuartzCore/QuartzCore.h>
#import <GLKit/GLKit.h>
#define LIGHT_DIRECTION 0, 1, -0.5
#define AMBIENT_LIGHT 0.5
@interface ViewController ()
@property (nonatomic, weak) IBOutlet UIView *containerView;
@property (nonatomic, strong) IBOutletCollection(UIView) NSArray *faces;
@end
@implementation ViewController
- (void)applyLightingToFace:(CALayer *)face
{
    //add lighting layer
    CALayer *layer = [CALayer layer];
    layer.frame = face.bounds;
    [face addSublayer:layer];
    //convert face transform to matrix
    //(GLKMatrix4 has the same structure as CATransform3D)
    CATransform3D transform = face.transform;
    GLKMatrix4 matrix4 = *(GLKMatrix4 *)&transform;
    GLKMatrix3 matrix3 = GLKMatrix4GetMatrix3(matrix4);
    //get face normal
    GLKVector3 normal = GLKVector3Make(0, 0, 1);
    normal = GLKMatrix3MultiplyVector3(matrix3, normal);
    normal = GLKVector3Normalize(normal);
    //get dot product with light direction
    GLKVector3 light = GLKVector3Normalize(GLKVector3Make(LIGHT_DIRECTION));
    float dotProduct = GLKVector3DotProduct(light, normal);
    //set lighting layer opacity
    CGFloat shadow = 1 + dotProduct - AMBIENT_LIGHT;
    UIColor *color = [UIColor colorWithWhite:0 alpha:shadow];
    layer.backgroundColor = color.CGColor;
}
- (void)addFace:(NSInteger)index withTransform:(CATransform3D)transform
{
    //get the face view and add it to the container
    UIView *face = self.faces[index];
    [self.containerView addSubview:face];
    //center the face view within the container
    CGSize containerSize = self.containerView.bounds.size;
    face.center = CGPointMake(containerSize.width / 2.0,
                              containerSize.height / 2.0);
    //apply the transform
    face.layer.transform = transform;
    //apply lighting
    [self applyLightingToFace:face.layer];
}
- (void)viewDidLoad
{
    [super viewDidLoad];
    //set up the container sublayer transform
    CATransform3D perspective = CATransform3DIdentity;
    perspective.m34 = -1.0 / 500.0;
    perspective = CATransform3DRotate(perspective, -M_PI_4, 1, 0, 0);
    perspective = CATransform3DRotate(perspective, -M_PI_4, 0, 1, 0);
    self.containerView.layer.sublayerTransform = perspective;
    //add cube face 1
    CATransform3D transform = CATransform3DMakeTranslation(0, 0, 100);
    [self addFace:0 withTransform:transform];
    //add cube face 2
    transform = CATransform3DMakeTranslation(100, 0, 0);
    transform = CATransform3DRotate(transform, M_PI_2, 0, 1, 0);
    [self addFace:1 withTransform:transform];
    //add cube face 3
    //move this code after the setup for face no. 6 to enable button
    transform = CATransform3DMakeTranslation(0, -100, 0);
    transform = CATransform3DRotate(transform, M_PI_2, 1, 0, 0);
    [self addFace:2 withTransform:transform];
    //add cube face 4
    transform = CATransform3DMakeTranslation(0, 100, 0);
    transform = CATransform3DRotate(transform, -M_PI_2, 1, 0, 0);
    [self addFace:3 withTransform:transform];
    //add cube face 5
    transform = CATransform3DMakeTranslation(-100, 0, 0);
    transform = CATransform3DRotate(transform, -M_PI_2, 0, 1, 0);
    [self addFace:4 withTransform:transform];
    //add cube face 6
    transform = CATransform3DMakeTranslation(0, 0, -100);
    transform = CATransform3DRotate(transform, M_PI, 0, 1, 0);
    [self addFace:5 withTransform:transform];
}
@end

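However, running it raises two questions:

The program does not run correctly.
Even once it runs correctly, why does it produce the picture shown above, and how do the individual transforms work?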

Solve

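The program does not run correctly

Compiling and running the book's source as-is shows a blank screen instead of the cube in the figure. Debugging shows that the problem is this line:

GLKMatrix4 matrix4 = *(GLKMatrix4 *)&transform;

Here transform is of type CATransform3D, which is defined as follows: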

struct CATransform3D
{
  CGFloat m11, m12, m13, m14;
  CGFloat m21, m22, m23, m24;
  CGFloat m31, m32, m33, m34;
  CGFloat m41, m42, m43, m44;
};
typedef struct CATransform3D CATransform3D;

On 64-bit platforms CGFloat is double, so CATransform3D is really a struct of 16 double elements, whereas GLKMatrix4 is defined as follows:

union _GLKMatrix4
{
    struct
    {
        float m00, m01, m02, m03;
        float m10, m11, m12, m13;
        float m20, m21, m22, m23;
        float m30, m31, m32, m33;
    };
    float m[16];
} __attribute__((aligned(16)));
typedef union _GLKMatrix4 GLKMatrix4;

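It holds 16 float elements. Taking the address of transform and casting it to GLKMatrix4 therefore converts nothing: it reinterprets the first 64 bytes of the 128-byte struct as floats, which yields garbage values. Changing the line to convert element by element fixes the problem: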

GLKMatrix4 matrix4 = GLKMatrix4Make(transform.m11, transform.m12, transform.m13, transform.m14, transform.m21, transform.m22, transform.m23, transform.m24, transform.m31, transform.m32, transform.m33, transform.m34, transform.m41, transform.m42, transform.m43, transform.m44);
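To see why the cast fails, here is a minimal sketch in plain C (which also compiles inside an Objective-C file). The helper names `matrix_double_to_float` and `matrix_reinterpret_bytes` are made up for illustration and are not part of GLKit. The second one only mimics what the pointer cast does, by copying raw bytes, and assumes a 4-byte float and an 8-byte double.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper: convert 16 doubles (the CATransform3D layout on
   64-bit) into 16 floats (the GLKMatrix4 layout), one element at a time. */
void matrix_double_to_float(const double src[16], float dst[16]) {
    assert(src != NULL && dst != NULL);
    for (int i = 0; i < 16; i++) {
        dst[i] = (float)src[i];  /* a real numeric conversion */
    }
}

/* Hypothetical helper: mimic the pointer cast by copying the first
   16 * sizeof(float) bytes of the double array without converting them. */
void matrix_reinterpret_bytes(const double src[16], float dst[16]) {
    assert(src != NULL && dst != NULL);
    memcpy(dst, src, 16 * sizeof(float));
}
```

With an identity matrix as input, the converted copy has 1.0 on its diagonal, while the reinterpreted copy does not even start with 1.0, because the bytes of each double are read as two unrelated floats.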

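3D Transforms & Perspective Projection

Homogeneous coordinates

Coordinate systems

In three-dimensional space, for a vector $v$ and a basis $o-xyz$, we can find coordinates $(v_1,v_2,v_3)$ such that

$$v = v_1x + v_2y + v_3z \tag{1}$$

For a point $p$, we can find coordinates $(p_1,p_2,p_3)$ such that

$$p-o = p_1x + p_2y + p_3z\tag{2}$$

These two expressions show that, to represent a point such as $p$ in a coordinate system, we treat its position as a displacement from the origin $o$ of the basis, that is, as the vector $(p-o)$. Some books call this a position vector: a special vector that starts at the origin. Expressing this vector gives us an equivalent expression for the point $p$:

$$p = o + p_1 x + p_2 y + p_3 z\tag{3}$$

(1) and (3) are the two different ways of expressing a vector and a point in a coordinate system. Both use algebraic components, but a point needs extra information compared with a vector. If I write down the components $(1, 4, 7)$, you cannot tell whether they describe a point or a vector.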

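Now let's write (1) and (3) in matrix form:

$$v=\begin{pmatrix} x & y & z & o \end{pmatrix}\bullet\begin{pmatrix}v_1\\v_2\\v_3\\ 0\end{pmatrix}$$
$$p=\begin{pmatrix}x & y & z & o \end{pmatrix}\bullet\begin{pmatrix}p_1\\p_2\\p_3\\1\end{pmatrix}$$

Here $(x,y,z,o)$ is the basis matrix, and the column vectors on the right are the coordinates of the vector $v$ and the point $p$ in that basis. Vectors and points now have different representations in the same basis: the fourth component of a 3D vector is 0, and the fourth component of a 3D point is 1. Representing 3D geometry with four components in this way is called a homogeneous coordinate representation.

Of the three most common affine transforms, translation T, rotation R and scaling S, translation is meaningful only for points, because an ordinary vector has no position, only magnitude and direction. The following equations make this clear:

$$\begin{pmatrix}1&0&0&t_x\\0&1&0&t_y\\0&0&1&t_z\\0&0&0&1\end{pmatrix}\bullet\begin{pmatrix}x\\y\\z\\1\end{pmatrix}=\begin{pmatrix}x+t_x\\y+t_y\\z+t_z\\1\end{pmatrix}$$
$$\begin{pmatrix}1&0&0&t_x\\0&1&0&t_y\\0&0&1&t_z\\0&0&0&1\end{pmatrix}\bullet\begin{pmatrix}x\\y\\z\\0\end{pmatrix}=\begin{pmatrix}x\\y\\z\\0\end{pmatrix}$$

Rotation and scaling are meaningful for both vectors and points, which you can check with the same homogeneous representation. Homogeneous coordinates are therefore very convenient for affine transforms.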
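The difference between points and vectors can also be checked in a few lines of C. This is only a sketch: `make_translation` and `mat4_mul_vec4` are hypothetical helpers written for this example, and the matrix is stored row by row in the column-vector convention used in the equations above.

```c
#include <assert.h>

/* Fill m (4x4, stored row by row) with a translation by (tx, ty, tz). */
void make_translation(double tx, double ty, double tz, double m[16]) {
    for (int i = 0; i < 16; i++) {
        m[i] = (i % 5 == 0) ? 1.0 : 0.0;  /* start from the identity */
    }
    m[3] = tx;   /* the last column holds the offsets */
    m[7] = ty;
    m[11] = tz;
}

/* out = m * v, where v is a homogeneous column vector (x, y, z, w). */
void mat4_mul_vec4(const double m[16], const double v[4], double out[4]) {
    assert(v != out);  /* the input must not be overwritten while it is read */
    for (int r = 0; r < 4; r++) {
        out[r] = 0.0;
        for (int c = 0; c < 4; c++) {
            out[r] += m[r * 4 + c] * v[c];
        }
    }
}
```

Applying a translation by (1, 2, 3) to the point (5, 6, 7, 1) gives (6, 8, 10, 1), while the vector (5, 6, 7, 0) comes back unchanged, because its zero fourth component cancels the last column of the matrix.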

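Affine transforms

There are three basic affine transforms: translation, scaling and rotation. Using points as the example, here is a brief look at the matrix for each of them.

Translation

In 3D homogeneous coordinates, any point $P=(x,y,z)$ is translated to the position $P'=(x',y',z')$ by adding the translation distances $t_x,t_y,t_z$ to its coordinates:

$$x'=x+t_x,y'=y+t_y,z'=z+t_z$$

To make them easy to handle in programs, 3D transforms are usually expressed in matrix form. Here the positions $P$ and $P'$ are written as 4-element homogeneous column vectors, and the transform T is a 4x4 matrix:

$$\begin{pmatrix}x'\\y'\\z'\\1\end{pmatrix}=\begin{pmatrix}1&0&0&t_x\\0&1&0&t_y\\0&0&1&t_z\\0&0&0&1\end{pmatrix}\bullet\begin{pmatrix}x\\y\\z\\1\end{pmatrix}$$

In 3D space, an object is translated by translating each of the points that define it and rebuilding the object at the new position. For an object represented by a set of polygon surfaces, translate the vertices of each surface and then redraw the faces at the new position.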

Rotation

An object can be rotated about any axis in space, but rotations about axes parallel to the coordinate axes are the easiest to handle. First, rotation about the z axis:

$$x'=x\cos\theta-y\sin\theta\\y'=x\sin\theta+y\cos\theta\\z'=z$$

The parameter $\theta$ is the rotation angle about the z axis, and the z coordinate is unchanged by this transform. In homogeneous coordinates, the z-axis rotation is written as:

$$\begin{pmatrix}x'\\y'\\z'\\1\end{pmatrix}=\begin{pmatrix}\cos\theta&-\sin\theta&0&0\\\sin\theta&\cos\theta&0&0\\0&0&1&0\\0&0&0&1\end{pmatrix}\bullet\begin{pmatrix}x\\y\\z\\1\end{pmatrix}$$

In the same way we get the formula for rotation about the x axis:

$$y'=y\cos\theta-z\sin\theta\\z'=y\sin\theta+z\cos\theta\\x'=x$$

and about the y axis:

$$z'=z\cos\theta-x\sin\theta\\x'=z\sin\theta+x\cos\theta\\y'=y$$

The corresponding matrix equations, for the x axis and then the y axis, are:

$$\begin{pmatrix}x'\\y'\\z'\\1\end{pmatrix}=\begin{pmatrix}1&0&0&0\\0&\cos\theta&-\sin\theta&0\\0&\sin\theta&\cos\theta&0\\0&0&0&1\end{pmatrix}\bullet\begin{pmatrix}x\\y\\z\\1\end{pmatrix}$$
$$\begin{pmatrix}x'\\y'\\z'\\1\end{pmatrix}=\begin{pmatrix}\cos\theta&0&\sin\theta&0\\0&1&0&0\\-\sin\theta&0&\cos\theta&0\\0&0&0&1\end{pmatrix}\bullet\begin{pmatrix}x\\y\\z\\1\end{pmatrix}$$
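As a quick sanity check of the z-axis matrix (a worked example added for illustration, with $\theta = 90^\circ$ chosen arbitrarily), rotating the point $(1,0,0)$ should move it onto the y axis. With $\cos 90^\circ = 0$ and $\sin 90^\circ = 1$:

$$\begin{pmatrix}0&-1&0&0\\1&0&0&0\\0&0&1&0\\0&0&0&1\end{pmatrix}\bullet\begin{pmatrix}1\\0\\0\\1\end{pmatrix}=\begin{pmatrix}0\\1\\0\\1\end{pmatrix}$$

The fourth component stays 1, so the result is still a point.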

Scaling

Scaling is simple: each coordinate is multiplied by the scale factor for its axis. The scaling equation is:

$$\begin{pmatrix}x'\\y'\\z'\\1\end{pmatrix}=\begin{pmatrix}s_x&0&0&0\\0&s_y&0&0\\0&0&s_z&0\\0&0&0&1\end{pmatrix}\bullet\begin{pmatrix}x\\y\\z\\1\end{pmatrix}$$

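Perspective projection

The 3D of CALayer on iOS is not true 3D, because the viewpoint (the observer, or camera position) cannot be moved freely. It is only a projection of 3D content onto a 2D plane. The projection plane is the phone screen, that is, the plane spanned by the x and y axes (note that iOS uses a left-handed coordinate system). So how is the viewpoint position determined? It is specified indirectly through $m_{34}$ of CATransform3D: $m_{34} = -1/z$, where $z$ is the viewer's position on the z axis. Where a layer's z axis sits is determined by its anchorPoint. The anchor point is the point that stays fixed during a transform, in other words the layer's origin for the transform, where the x, y and z axes meet.

In $m_{34} = -1/z$, a positive $z$ gives what the human eye sees in the real world: on the projection plane, near objects look larger and far objects look smaller. The closer $z$ is to the origin, the stronger the effect; the farther away, the weaker. As $z$ goes to positive infinity the effect disappears: the projection lines become perpendicular to the projection plane, meaning the viewpoint is infinitely far away. The default value of $m_{34}$ in CATransform3D is 0, which corresponds to a viewpoint at infinity.

Taking the point $P(10,0,-10)$ as an example, let's see how a point in 3D space is projected onto the projection plane (the phone screen) by the iOS perspective mechanism.

Figure: perspective projection

In the figure, the green point is P and the viewpoint is where the observer sits. We look at P from the viewpoint, so the dashed line is the projection line and its intersection with the x axis is the projected point. Setting $m_{34}=-1/500$ gives the projection matrix below. (Core Animation multiplies row vectors from the left; the column-vector notation used in this article is the transpose of that, so the entry appears in row 4, column 3.)

$$\begin{pmatrix}1&0&0&0\\0&1&0&0\\0&0&1&0\\0&0&-\frac{1}{500}&1\end{pmatrix}$$

Multiplying it by the point P gives:

$$\begin{pmatrix}1&0&0&0\\0&1&0&0\\0&0&1&0\\0&0&-\frac{1}{500}&1\end{pmatrix}\bullet\begin{pmatrix}10\\0\\-10\\1\end{pmatrix}=\begin{pmatrix}10\\0\\-10\\1.02\end{pmatrix}$$

Dividing by the fourth component, 1.02, to bring the result back to the normalized homogeneous form gives:

$$\begin{pmatrix}10/1.02\\0\\-10/1.02\\1\end{pmatrix}$$

This matches the figure above. Since the projection is onto the xy plane, the z coordinate can simply be set to 0.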
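The same projection can be written as a small C function. This is a sketch under the assumptions of this section: the viewer sits at $(0, 0, d)$ on the z axis, the projection plane is $z = 0$, and `Point2D` and `project_point` are hypothetical names introduced only for this example.

```c
#include <assert.h>

/* A projected 2D point on the screen plane (z = 0). */
typedef struct {
    double x;
    double y;
} Point2D;

/* Project (x, y, z) onto the xy plane as seen from a viewer at (0, 0, d).
   This mirrors setting m34 = -1/d and then dividing by w = 1 - z/d. */
Point2D project_point(double x, double y, double z, double d) {
    assert(d != 0.0);
    double w = 1.0 - z / d;  /* fourth component after the projection matrix */
    assert(w > 0.0);         /* the point must lie in front of the viewer */
    Point2D p = { x / w, y / w };
    return p;
}
```

Calling `project_point(10, 0, -10, 500)` reproduces the worked example: $w = 1.02$ and the projected x coordinate is $10/1.02 \approx 9.80$. A point that already lies on the screen plane ($z = 0$) is left where it is.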

Display

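Rendering the cube

With the basic theory covered, let's return to the question and see why the cube built in the code above appears on screen the way it does.

The demo creates six views, one for each face of the cube, and each face is translated and rotated when it is set up. After these translations and rotations the six faces form a cube in 3D space, centered at the origin with side length 200. Its eight vertices are:

$$(100,100,100)\\(100,100,-100)\\(100,-100,100)\\(100,-100,-100)\\(-100,100,100)\\(-100,100,-100)\\(-100,-100,100)\\(-100,-100,-100)$$

To keep things simple, we only need to know where these eight points end up on the projection plane after all the transforms to know what the whole cube looks like.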

Now let's look at the transforms applied to the container view as a whole:

CATransform3D perspective = CATransform3DIdentity;
perspective.m34 = -1.0 / 500.0;
perspective = CATransform3DRotate(perspective, -M_PI_4, 1, 0, 0);
perspective = CATransform3DRotate(perspective, -M_PI_4, 0, 1, 0);
self.containerView.layer.sublayerTransform = perspective;

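To make the result look realistic, the program first sets $m_{34}$ and then calls CATransform3DRotate twice: $-45^\circ$ about the x axis, then $-45^\circ$ about the y axis. CATransform3DRotate multiplies the new rotation in front of the existing transform, and Core Animation applies the matrix to row vectors, so each point is rotated about the y axis first, then about the x axis, and finally projected. In the column-vector notation used here, the final matrix perspective is therefore the projection matrix times the x rotation times the y rotation:

$$
\begin{pmatrix}1&0&0&0\\0&1&0&0\\0&0&1&0\\0&0&-\frac{1}{500}&1\end{pmatrix}\bullet
\begin{pmatrix}1&0&0&0\\
0&\frac{\sqrt 2}{2}&\frac{\sqrt 2}{2}&0\\
0&-\frac{\sqrt 2}{2}&\frac{\sqrt 2}{2}&0\\
0&0&0&1\end{pmatrix}\bullet
\begin{pmatrix}
\frac{\sqrt 2}{2}&0&-\frac{\sqrt 2}{2}&0\\
0&1&0&0\\
\frac{\sqrt 2}{2}&0&\frac{\sqrt 2}{2}&0\\
0&0&0&1
\end{pmatrix}\\=
\begin{pmatrix}
\frac{\sqrt 2}{2}&0&-\frac{\sqrt 2}{2}&0\\
\frac{1}{2}&\frac{\sqrt 2}{2}&\frac{1}{2}&0\\
\frac{1}{2}&-\frac{\sqrt 2}{2}&\frac{1}{2}&0\\
-\frac{1}{1000}&\frac{\sqrt 2}{1000}&-\frac{1}{1000}&1
\end{pmatrix}=perspective
$$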
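The product can be verified numerically. The sketch below multiplies the projection matrix by the x rotation and then by the y rotation (both by $-45^\circ$), in column-vector form with matrices stored row by row. `mat4_mul`, `build_perspective` and `HALF_SQRT2` are names invented for this example and are not Core Animation API.

```c
#include <assert.h>

#define HALF_SQRT2 0.70710678118654752440  /* sqrt(2)/2 = cos(45 degrees) */

/* out = a * b for 4x4 matrices stored row by row. */
void mat4_mul(const double a[16], const double b[16], double out[16]) {
    assert(out != a && out != b);  /* the output must not alias an input */
    for (int r = 0; r < 4; r++) {
        for (int c = 0; c < 4; c++) {
            double sum = 0.0;
            for (int k = 0; k < 4; k++) {
                sum += a[r * 4 + k] * b[k * 4 + c];
            }
            out[r * 4 + c] = sum;
        }
    }
}

/* Build P * Rx(-45 deg) * Ry(-45 deg) in column-vector form. */
void build_perspective(double out[16]) {
    const double s = HALF_SQRT2;
    const double P[16]  = {1, 0, 0, 0,  0, 1, 0, 0,  0, 0, 1, 0,  0, 0, -1.0 / 500.0, 1};
    const double Rx[16] = {1, 0, 0, 0,  0, s, s, 0,  0, -s, s, 0,  0, 0, 0, 1};
    const double Ry[16] = {s, 0, -s, 0,  0, 1, 0, 0,  s, 0, s, 0,  0, 0, 0, 1};
    double rotation[16];
    mat4_mul(Rx, Ry, rotation);  /* y rotation is applied to points first */
    mat4_mul(P, rotation, out);  /* projection is applied last */
}
```

The entries of the result can be compared one by one with the matrix perspective above; for instance, the last row comes out as approximately $(-0.001, 0.001414, -0.001, 1)$.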

With the transform matrix in hand, we multiply each of the eight points by it to get their positions on the projection plane. Taking $(100,100,100,1)$ as an example:

$$
\begin{pmatrix}
\frac{\sqrt 2}{2}&0&-\frac{\sqrt 2}{2}&0\\
\frac{1}{2}&\frac{\sqrt 2}{2}&\frac{1}{2}&0\\
\frac{1}{2}&-\frac{\sqrt 2}{2}&\frac{1}{2}&0\\
-\frac{1}{1000}&\frac{\sqrt 2}{1000}&-\frac{1}{1000}&1
\end{pmatrix}\bullet
\begin{pmatrix}100\\100\\100\\1\end{pmatrix}=
\begin{pmatrix}0\\100+50\sqrt 2\\100-50\sqrt 2\\\frac{8+\sqrt 2}{10}\end{pmatrix}\to
\begin{pmatrix}0\\181.33\\0\\1\end{pmatrix}
$$

The other seven points, in the same order as the vertex list, work out to:

$$
(123.90,61.95,0,1)\\
(0,44.47,0,1)\\
(164.72,-82.36,0,1)\\
(-123.90,61.95,0,1)\\
(0,-21.83,0,1)\\
(-164.72,-82.36,0,1)\\
(0,-161.26,0,1)
$$
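These projections are easy to reproduce in code. The following sketch hard-codes the combined matrix from this section (column-vector form, stored row by row) and applies it to one vertex at a time, followed by the division by the fourth component. `PERSPECTIVE` and `project_vertex` are hypothetical names for this example only.

```c
#include <assert.h>

#define HALF_SQRT2 0.70710678118654752440  /* sqrt(2)/2 */

/* The combined matrix from this section, stored row by row. */
static const double PERSPECTIVE[16] = {
    HALF_SQRT2, 0.0,                -HALF_SQRT2, 0.0,
    0.5,        HALF_SQRT2,         0.5,         0.0,
    0.5,        -HALF_SQRT2,        0.5,         0.0,
    -0.001,     HALF_SQRT2 / 500.0, -0.001,      1.0
};

/* Multiply m by the point (v[0], v[1], v[2], 1), divide by the fourth
   component, and return the resulting x and y in out. */
void project_vertex(const double m[16], const double v[3], double out[2]) {
    double h[4];
    for (int r = 0; r < 4; r++) {
        h[r] = m[r * 4 + 0] * v[0] + m[r * 4 + 1] * v[1]
             + m[r * 4 + 2] * v[2] + m[r * 4 + 3];
    }
    assert(h[3] > 0.0);  /* the vertex must be in front of the viewer */
    out[0] = h[0] / h[3];
    out[1] = h[1] / h[3];
}
```

For the vertex $(100,100,100)$ this returns approximately $(0, 181.33)$, matching the worked example; looping over all eight vertices gives the full outline of the projected cube.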

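Plotting these eight points on the xy plane gives the following picture:

Figure: the eight projected vertices on the xy plane

It looks just like the rendered result shown earlier. We now understand how an object in 3D space ends up on the phone screen after a series of transforms and a projection. Next, let's see how the lighting and shading of the cube's faces is computed.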

Lighting and shading

Let's look at the code first:

- (void)applyLightingToFace:(CALayer *)face
{
    //add lighting layer
    CALayer *layer = [CALayer layer];
    layer.frame = face.bounds;
    [face addSublayer:layer];
    //convert face transform to matrix
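    //(CATransform3D holds CGFloat, which is double on 64-bit,
    //so convert element by element instead of casting the pointer)
    CATransform3D transform = face.transform;
    GLKMatrix4 matrix4 = GLKMatrix4Make(transform.m11, transform.m12, transform.m13, transform.m14,
                                        transform.m21, transform.m22, transform.m23, transform.m24,
                                        transform.m31, transform.m32, transform.m33, transform.m34,
                                        transform.m41, transform.m42, transform.m43, transform.m44);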
    GLKMatrix3 matrix3 = GLKMatrix4GetMatrix3(matrix4);
    //get face normal
    GLKVector3 normal = GLKVector3Make(0, 0, 1);
    normal = GLKMatrix3MultiplyVector3(matrix3, normal);
    normal = GLKVector3Normalize(normal);
    //get dot product with light direction
    GLKVector3 light = GLKVector3Normalize(GLKVector3Make(LIGHT_DIRECTION));
    float dotProduct = GLKVector3DotProduct(light, normal);
    //set lighting layer opacity
    CGFloat shadow = 1 + dotProduct - AMBIENT_LIGHT;
    UIColor *color = [UIColor colorWithWhite:0 alpha:shadow];
    layer.backgroundColor = color.CGColor;
}

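The code above proceeds in several steps:

Reduce the 4x4 homogeneous transform matrix to 3x3. The light here is directional (parallel rays), so translation is irrelevant, and the fourth row and column of the original matrix mainly serve translation and perspective projection. The upper-left 3x3 block is enough.
Get the normal vector of the lit face. Every face starts out parallel to the xy plane and reaches its position through a series of transforms. The normal of the initial plane is $(0,0,1)$, so applying the same transform to that normal gives the normal of the lit face. (This works directly because the 3x3 block here is a pure rotation.)
Take the dot product of the light vector and the face normal, which gives the projection of the light vector onto the normal. The derivation is as follows:

Figure: projection of the vector $u$ onto $v$

Suppose the projection of $u$ onto $v$ is $u'$, and the angle between $u$ and $v$ is $\theta$. A vector has two properties, magnitude and direction. We first determine the magnitude of $u'$ (its length, or norm): drop a perpendicular from the tip of $u$ onto $v$, and $d$ is the length of $u'$. The direction of $u'$ is the same as that of $v$, namely $v/|v|$. So we have

$$u'=d\frac{v}{|v|}\tag{1}$$

Next, find the length $d$:

$$d=|u|\cos\theta\tag{2}$$

Finally, find $\cos\theta$:

$$\cos\theta=\frac{u\cdot v}{|u||v|}\tag{3}$$

Combining these gives

$$u'=\frac{u\cdot v}{|v|^2}v$$

This is the projection vector. Its length $d$ is

$$|u'|=|u|\cos\theta=|u|\frac{u\cdot v}{|u||v|}=\frac{u\cdot v}{|v|}$$

Because the normal $v$ is a unit vector (and the light direction is normalized as well), the light intensity in the code is simply the dot product of the two vectors.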
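As an illustrative calculation, assuming the values from the listing (LIGHT_DIRECTION of $(0, 1, -0.5)$ and AMBIENT_LIGHT of $0.5$), the normalized light direction is

$$\hat l=\frac{(0,1,-0.5)}{\sqrt{1.25}}\approx(0,\ 0.894,\ -0.447)$$

Face 1 is only translated, so its normal stays $n=(0,0,1)$:

$$\hat l\cdot n\approx-0.447,\qquad shadow=1+(-0.447)-0.5\approx0.053$$

Face 2 is rotated by $90^\circ$ about the y axis, which turns the normal into $n=(1,0,0)$:

$$\hat l\cdot n=0,\qquad shadow=1+0-0.5=0.5$$

Face 4 is rotated by $-90^\circ$ about the x axis, which turns the normal into $n=(0,1,0)$:

$$\hat l\cdot n\approx0.894,\qquad shadow=1+0.894-0.5\approx1.394$$

So face 1 is barely darkened, face 2 gets a half-opaque black overlay, and face 4 ends up with a value above 1, which is presumably clamped to a fully opaque overlay when it is used as the alpha of the color.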

At this point the whole code should be clear.
