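The cocos2d rendering pipeline
cocos2d renders each frame in three steps:
- Each UI element submits a render command (RenderCommand) to a render queue (RenderQueue).
- The render queue sorts the pending commands to settle the draw order.
- The Renderer executes the render commands to draw the frame.

The UI tree
cocos2d manages its UI elements in a tree. Every node in the tree is a Node or one of its subclasses. For the scene currently on screen, the Scene is the root, and the Layer, Sprite and other objects added through addChild are its descendants.
Once cocos2d has started and entered its main loop, every iteration traverses the UI tree once. The Director's mainLoop method looks like this: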
void DisplayLinkDirector::mainLoop()
{
    if (_purgeDirectorInNextLoop)
    {
        _purgeDirectorInNextLoop = false;
        purgeDirector();
    }
    else if (_restartDirectorInNextLoop)
    {
        _restartDirectorInNextLoop = false;
        restartDirector();
    }
    else if (! _invalid)
    {
        drawScene();
        // release the objects
        PoolManager::getInstance()->getCurrentPool()->clear();
    }
}
The part that matters for rendering is drawScene. Jumping to Director::drawScene(), we find:
    if (_runningScene)
    {
#if (CC_USE_PHYSICS || (CC_USE_3D_PHYSICS && CC_ENABLE_BULLET_INTEGRATION) || CC_USE_NAVMESH)
        _runningScene->stepPhysicsAndNavigation(_deltaTime);
#endif
        //clear draw stats
        _renderer->clearDrawStats();
        //render the scene
        _runningScene->render(_renderer);
        _eventDispatcher->dispatchEvent(_eventAfterVisit);
    }
_runningScene is the Scene currently being displayed. Jumping to void Scene::render(Renderer* renderer), we see this line:
    //visit the scene
    visit(renderer, transform, 0);
This call traverses the UI tree recursively. visit is implemented in Node and Scene does not override it, so here is Node's implementation:
void Node::visit(Renderer* renderer, const Mat4 &parentTransform, uint32_t parentFlags)
{
    // quick return if not visible. children won't be drawn.
    if (!_visible)
    {
        return;
    }
    uint32_t flags = processParentFlags(parentTransform, parentFlags);
    // IMPORTANT:
    // To ease the migration to v3.0, we still support the Mat4 stack,
    // but it is deprecated and your code should not rely on it
    _director->pushMatrix(MATRIX_STACK_TYPE::MATRIX_STACK_MODELVIEW);
    _director->loadMatrix(MATRIX_STACK_TYPE::MATRIX_STACK_MODELVIEW, _modelViewTransform);
    bool visibleByCamera = isVisitableByVisitingCamera();
    int i = 0;
    if(!_children.empty())
    {
        sortAllChildren();
        // draw children zOrder < 0
        for( ; i < _children.size(); i++ )
        {
            auto node = _children.at(i);
            if (node && node->_localZOrder < 0)
                node->visit(renderer, _modelViewTransform, flags);
            else
                break;
        }
        // self draw
        if (visibleByCamera)
            this->draw(renderer, _modelViewTransform, flags);
        for(auto it=_children.cbegin()+i; it != _children.cend(); ++it)
            (*it)->visit(renderer, _modelViewTransform, flags);
    }
    else if (visibleByCamera)
    {
        this->draw(renderer, _modelViewTransform, flags);
    }
    _director->popMatrix(MATRIX_STACK_TYPE::MATRIX_STACK_MODELVIEW);
}
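The method first checks whether the Node is visible; an invisible node is not drawn, and neither are its children. If the Node has children, it sorts them and then calls visit on them recursively. Children are sorted by zOrder (_localZOrder), smaller values first. Children with a zOrder below 0 are visited first, then the node draws itself, then the children with a zOrder of 0 or above are visited. This is effectively an in-order traversal: a node is drawn after its negative-zOrder children and before the rest, which keeps the draw order correct.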
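To make the visiting order concrete, here is a minimal sketch of the same traversal on a toy tree. ToyNode and this free-standing visit function are hypothetical stand-ins rather than cocos2d types, and std::stable_sort is used here as an assumption so that children with equal zOrder keep their insertion order.

```cpp
#include <algorithm>
#include <cstddef>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Toy stand-in for cocos2d's Node (hypothetical): only what the traversal needs.
struct ToyNode {
    std::string name;
    int localZOrder;
    bool visible = true;
    std::vector<std::unique_ptr<ToyNode>> children;

    ToyNode(std::string n, int z) : name(std::move(n)), localZOrder(z) {}

    ToyNode* add(std::string childName, int z) {
        children.push_back(std::make_unique<ToyNode>(std::move(childName), z));
        return children.back().get();
    }
};

// Appends node names to drawOrder in the order the nodes would draw themselves.
void visit(ToyNode& node, std::vector<std::string>& drawOrder) {
    if (!node.visible) return;  // an invisible node hides its whole subtree

    // Assumption: a stable sort, so equal z-orders keep insertion order.
    std::stable_sort(node.children.begin(), node.children.end(),
                     [](const std::unique_ptr<ToyNode>& a,
                        const std::unique_ptr<ToyNode>& b) {
                         return a->localZOrder < b->localZOrder;
                     });

    std::size_t i = 0;
    // 1. children with zOrder < 0
    for (; i < node.children.size() && node.children[i]->localZOrder < 0; ++i)
        visit(*node.children[i], drawOrder);

    // 2. the node itself
    drawOrder.push_back(node.name);

    // 3. children with zOrder >= 0
    for (; i < node.children.size(); ++i)
        visit(*node.children[i], drawOrder);
}
```

For a scene with children background (-1), layer (0) and hud (5), where layer has children shadow (-1) and hero (1), the recorded order is background, scene, shadow, layer, hero, hud.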
To draw itself, a Node calls its draw method. Here is that method:
void Node::draw(Renderer* renderer, const Mat4 &transform, uint32_t flags)
{
}
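It is empty. That is because Node is not meant to load or display textures. Displaying a texture is the job of the Sprite class, so the method to look at is Sprite's draw: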
void Sprite::draw(Renderer *renderer, const Mat4 &transform, uint32_t flags)
{
#if CC_USE_CULLING
    // Don't do calculate the culling if the transform was not updated
    _insideBounds = (flags & FLAGS_TRANSFORM_DIRTY) ? renderer->checkVisibility(transform, _contentSize) : _insideBounds;
    if(_insideBounds)
#endif
    {
        _trianglesCommand.init(_globalZOrder, _texture->getName(), getGLProgramState(), _blendFunc, _polyInfo.triangles, transform, flags);
        renderer->addCommand(&_trianglesCommand);
    }
}
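_insideBounds is the visibility check for this Sprite: if the sprite is outside the screen it is culled and takes no part in rendering. The check is only recomputed when the transform has changed; otherwise the cached result is reused.
Once it passes the check, the Sprite submits a RenderCommand to the Renderer.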
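The culling step above can be sketched roughly as follows. This is a simplified illustration, not cocos2d's checkVisibility: ToyRect, ToySprite and shouldDraw are hypothetical names, and the sprite's bounds are assumed to be already expressed in screen space instead of being derived from the transform matrix.

```cpp
// Axis-aligned rectangle in screen space (hypothetical helper type).
struct ToyRect {
    float x, y, w, h;
};

// True when the two rectangles overlap.
bool intersects(const ToyRect& a, const ToyRect& b) {
    return a.x < b.x + b.w && b.x < a.x + a.w &&
           a.y < b.y + b.h && b.y < a.y + a.h;
}

struct ToySprite {
    ToyRect screenBounds{0.0f, 0.0f, 0.0f, 0.0f};
    bool insideBounds = true;  // cached result of the last visibility check
    int checksPerformed = 0;   // only here to show when the check really runs

    // Returns true when the sprite should submit a render command.
    // The intersection test is redone only if the transform changed.
    bool shouldDraw(const ToyRect& screen, bool transformDirty) {
        if (transformDirty) {
            insideBounds = intersects(screenBounds, screen);
            ++checksPerformed;
        }
        return insideBounds;
    }
};
```

The design point is the cache: a sprite that has not moved reuses its previous answer, so the per-frame cost of culling is close to zero for static content.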
RenderCommand
cocos2d has many kinds of RenderCommand. The RenderCommand class defines them as follows:
    enum class Type
    {
        /** Reserved type.*/
        UNKNOWN_COMMAND,
        /** Quad command, used for draw quad.*/
        QUAD_COMMAND,
        /**Custom command, used for calling callback for rendering.*/
        CUSTOM_COMMAND,
        /**Batch command, used for draw batches in texture atlas.*/
        BATCH_COMMAND,
        /**Group command, which can group command in a tree hierarchy.*/
        GROUP_COMMAND,
        /**Mesh command, used to draw 3D meshes.*/
        MESH_COMMAND,
        /**Primitive command, used to draw primitives such as lines, points and triangles.*/
        PRIMITIVE_COMMAND,
        /**Triangles command, used to draw triangles.*/
        TRIANGLES_COMMAND
    };
Different RenderCommands are used in different places. Sprite, for example, uses TrianglesCommand:
class CC_DLL TrianglesCommand : public RenderCommand
{
public:
    /**The structure of Triangles. */
    struct Triangles
    {
        /**Vertex data pointer.*/
        V3F_C4B_T2F* verts;
        /**Index data pointer.*/
        unsigned short* indices;
        /**The number of vertices.*/
        ssize_t vertCount;
        /**The number of indices.*/
        ssize_t indexCount;
    };
    // ...
protected:
    /**Generate the material ID by textureID, glProgramState, and blend function.*/
    void generateMaterialID();
    /**Generated material id.*/
    uint32_t _materialID;
    /**OpenGL handle for texture.*/
    GLuint _textureID;
    /**GLprogramstate for the commmand. encapsulate shaders and uniforms.*/
    GLProgramState* _glProgramState;
    /**Blend function when rendering the triangles.*/
    BlendFunc _blendType;
    /**Rendered triangles.*/
    Triangles _triangles;
    /**Model view matrix when rendering the triangles.*/
    Mat4 _mv;
};
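(Part of the code is omitted.) The class defines a struct that describes the triangles. V3F_C4B_T2F describes one vertex (Vertices: position, 3 floats; Color: 4 unsigned bytes; Texture: texture coordinates, 2 floats). The remaining three members are the index data, the vertex count and the index count.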
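As an illustration of that vertex layout, the sketch below rebuilds it with hypothetical Toy types and fills the four vertices and six indices that two triangles need to cover a rectangle. The corner order (top-left, bottom-left, top-right, bottom-right) and the index pattern are assumptions chosen for the example, and the static_asserts assume 4-byte floats with no extra padding.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical mirrors of the vertex components: 3 floats, 4 bytes, 2 floats.
struct ToyVec3    { float x, y, z; };
struct ToyColor4B { std::uint8_t r, g, b, a; };
struct ToyTex2F   { float u, v; };

struct ToyV3F_C4B_T2F {
    ToyVec3    vertices;   // position
    ToyColor4B colors;     // RGBA, one byte per channel
    ToyTex2F   texCoords;  // texture coordinates
};

// 12 + 4 + 8 bytes; these offsets are what glVertexAttribPointer is given.
static_assert(sizeof(ToyV3F_C4B_T2F) == 24, "unexpected padding");
static_assert(offsetof(ToyV3F_C4B_T2F, colors) == 12, "unexpected layout");
static_assert(offsetof(ToyV3F_C4B_T2F, texCoords) == 16, "unexpected layout");

// A rectangle drawn as two triangles: 4 shared vertices, 6 indices.
struct ToyQuad {
    ToyV3F_C4B_T2F verts[4];
    unsigned short indices[6];
};

ToyQuad makeQuad(float x, float y, float w, float h) {
    ToyQuad q{};
    // Assumed corner order: top-left, bottom-left, top-right, bottom-right.
    const float xs[4] = {x, x, x + w, x + w};
    const float ys[4] = {y + h, y, y + h, y};
    const float us[4] = {0.0f, 0.0f, 1.0f, 1.0f};
    const float vs[4] = {0.0f, 1.0f, 0.0f, 1.0f};
    for (int i = 0; i < 4; ++i) {
        q.verts[i].vertices  = ToyVec3{xs[i], ys[i], 0.0f};
        q.verts[i].colors    = ToyColor4B{255, 255, 255, 255};  // opaque white
        q.verts[i].texCoords = ToyTex2F{us[i], vs[i]};
    }
    // Two triangles sharing the bottom-left/top-right diagonal.
    const unsigned short idx[6] = {0, 1, 2, 3, 2, 1};
    for (int i = 0; i < 6; ++i) q.indices[i] = idx[i];
    return q;
}
```

Sharing vertices through an index buffer is why vertCount and indexCount are tracked separately: here 4 vertices serve 6 indices.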
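So a RenderCommand is essentially a wrapper around a set of OpenGL parameters. After the UI tree has been traversed, cocos2d hands those parameters to OpenGL to do the actual drawing.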
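The deferred nature of this design can be shown with a very small sketch. ToyDrawCommand and ToyDeferredRenderer are hypothetical; the "drawing" is just a counter, standing in for the real OpenGL calls.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical command: just the data a draw call would need.
struct ToyDrawCommand {
    unsigned textureId;
    int indexCount;
};

class ToyDeferredRenderer {
public:
    // Called during tree traversal: only records the command, draws nothing.
    void addCommand(const ToyDrawCommand* cmd) { _queue.push_back(cmd); }

    // Called once after traversal: executes everything that was recorded.
    // Returns the number of commands executed.
    int render() {
        int executed = 0;
        for (const ToyDrawCommand* cmd : _queue) {
            _indicesDrawn += cmd->indexCount;  // stand-in for a GL draw call
            ++executed;
        }
        _queue.clear();  // the queue is rebuilt from scratch every frame
        return executed;
    }

    std::size_t pending() const { return _queue.size(); }
    int indicesDrawn() const { return _indicesDrawn; }

private:
    std::vector<const ToyDrawCommand*> _queue;
    int _indicesDrawn = 0;
};
```

Because nothing is drawn while the tree is being walked, the renderer sees the whole frame before it issues a single GL call, which is what makes sorting and batching possible.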
Renderer
Once all RenderCommands have been submitted, the Renderer draws according to them. At the end of Scene's render method, the Renderer's render method is called:
        renderer->render();
The Renderer's render method looks like this:
void Renderer::render()
{
    //Uncomment this once everything is rendered by new renderer
    //glClear(GL_COLOR_BUFFER_BIT | GL_DEPTH_BUFFER_BIT);
    //TODO: setup camera or MVP
    _isRendering = true;
    if (_glViewAssigned)
    {
        //Process render commands
        //1. Sort render commands based on ID
        for (auto &renderqueue : _renderGroups)
        {
            renderqueue.sort();
        }
        visitRenderQueue(_renderGroups[0]);
    }
    clean();
    _isRendering = false;
}
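The Renderer reaches the RenderCommands through an array of RenderQueues that it maintains. The default main queue is the RenderQueue at index 0. The other RenderQueues belong to GroupCommands: each GroupCommand points to one RenderQueue, which lets a group of commands be drawn with its own render settings.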
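One way to picture the relationship between the main queue and the group queues is the sketch below. It is a simplified model under stated assumptions: ToyGroupCmd, ToyGroupQueue and visitQueue are hypothetical, a group command is represented only by the index of the queue it points to, and the state changes that surround a real group are left out.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical command: either a plain draw (groupQueueId < 0)
// or a group command that points at another queue by index.
struct ToyGroupCmd {
    std::string label;
    int groupQueueId;
};

using ToyGroupQueue = std::vector<ToyGroupCmd>;

// Visits queue `id`, descending into the referenced queue at the point
// where a group command appears. Records the labels in execution order.
void visitQueue(const std::vector<ToyGroupQueue>& renderGroups,
                std::size_t id,
                std::vector<std::string>& executed) {
    for (const ToyGroupCmd& cmd : renderGroups[id]) {
        if (cmd.groupQueueId >= 0) {
            visitQueue(renderGroups,
                       static_cast<std::size_t>(cmd.groupQueueId), executed);
        } else {
            executed.push_back(cmd.label);
        }
    }
}
```

With queue 0 holding bg, a group command pointing at queue 1, and hud, and queue 1 holding maskedA and maskedB, the execution order is bg, maskedA, maskedB, hud. Only queue 0 is visited directly; every other queue is reached through its group command.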
Here every queue is sorted in turn. This is RenderQueue's sort method:
void RenderQueue::sort()
{
    // Don't sort _queue0, it already comes sorted
    std::sort(std::begin(_commands[QUEUE_GROUP::TRANSPARENT_3D]), std::end(_commands[QUEUE_GROUP::TRANSPARENT_3D]), compare3DCommand);
    std::sort(std::begin(_commands[QUEUE_GROUP::GLOBALZ_NEG]), std::end(_commands[QUEUE_GROUP::GLOBALZ_NEG]), compareRenderCommand);
    std::sort(std::begin(_commands[QUEUE_GROUP::GLOBALZ_POS]), std::end(_commands[QUEUE_GROUP::GLOBALZ_POS]), compareRenderCommand);
}
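The first call sorts the transparent 3D commands, which we set aside here. The other two sort the commands whose GlobalZOrder is below 0 and above 0. Commands whose GlobalZOrder is 0 are not sorted.
The reason is that every UI element has a GlobalZOrder of 0 by default, and the traversal of the UI tree has already put those elements in the right order (the in-order traversal described earlier), so no further sorting is needed. GlobalZOrder exists to override an element's draw order regardless of its position in the tree, so in most cases it should be left at its default, which also avoids the cost of sorting.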
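The split into sub-queues can be sketched like this. ToyZCmd and ToyZQueue are hypothetical, the 3D sub-queues are left out, and std::stable_sort is used here as an assumption (the excerpt above uses std::sort) so that the result for equal keys is deterministic.

```cpp
#include <algorithm>
#include <string>
#include <vector>

struct ToyZCmd {
    std::string label;
    float globalZ;
};

// Hypothetical 2D-only queue: three sub-queues keyed by the sign of globalZ.
class ToyZQueue {
public:
    void push(const ToyZCmd& cmd) {
        if (cmd.globalZ < 0.0f)      _neg.push_back(cmd);
        else if (cmd.globalZ > 0.0f) _pos.push_back(cmd);
        else                         _zero.push_back(cmd);
    }

    // Only the non-zero sub-queues are sorted; the zero sub-queue already
    // holds commands in tree-traversal order.
    void sort() {
        auto byZ = [](const ToyZCmd& a, const ToyZCmd& b) {
            return a.globalZ < b.globalZ;
        };
        std::stable_sort(_neg.begin(), _neg.end(), byZ);
        std::stable_sort(_pos.begin(), _pos.end(), byZ);
    }

    // Draw order: negative globalZ, then zero, then positive.
    std::vector<std::string> drawOrder() const {
        std::vector<std::string> out;
        for (const ToyZCmd& c : _neg)  out.push_back(c.label);
        for (const ToyZCmd& c : _zero) out.push_back(c.label);
        for (const ToyZCmd& c : _pos)  out.push_back(c.label);
        return out;
    }

private:
    std::vector<ToyZCmd> _neg, _zero, _pos;
};
```

Pushing a (0), b (5), c (-2), d (0), e (-7), f (1) and sorting gives the order e, c, a, d, f, b: a and d keep their submission order, and only the four commands with a non-zero globalZ paid for a sort.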
Next, render calls visitRenderQueue:
void Renderer::visitRenderQueue(RenderQueue& queue)
{
    queue.saveRenderState();
    //
    //Process Global-Z = 0 Queue
    //
    const auto& zZeroQueue = queue.getSubQueue(RenderQueue::QUEUE_GROUP::GLOBALZ_ZERO);
    if (zZeroQueue.size() > 0)
    {
        if(_isDepthTestFor2D)
        {
            glEnable(GL_DEPTH_TEST);
            glDepthMask(true);
            RenderState::StateBlock::_defaultState->setDepthTest(true);
            RenderState::StateBlock::_defaultState->setDepthWrite(true);
        }
        else
        {
            glDisable(GL_DEPTH_TEST);
            glDepthMask(false);
            RenderState::StateBlock::_defaultState->setDepthTest(false);
            RenderState::StateBlock::_defaultState->setDepthWrite(false);
        }
        for (auto it = zZeroQueue.cbegin(); it != zZeroQueue.cend(); ++it)
        {
            processRenderCommand(*it);
        }
        flush();
    }
    queue.restoreRenderState();
}
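The code is long, so only the part for GlobalZOrder 0 is kept here. The method starts by setting up some OpenGL state, such as whether depth testing is enabled, and then calls processRenderCommand, which, judging by its name, executes the render commands and starts the drawing: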
void Renderer::processRenderCommand(RenderCommand* command)
{
    auto commandType = command->getType();
    if( RenderCommand::Type::TRIANGLES_COMMAND == commandType)
    {
        //Draw if we have batched other commands which are not triangle command
        flush3D();
        flushQuads();
        //Process triangle command
        auto cmd = static_cast<TrianglesCommand*>(command);
        //Draw batched Triangles if necessary
        if(cmd->isSkipBatching() || _filledVertex + cmd->getVertexCount() > VBO_SIZE || _filledIndex + cmd->getIndexCount() > INDEX_VBO_SIZE)
        {
            CCASSERT(cmd->getVertexCount()>= 0 && cmd->getVertexCount() < VBO_SIZE, "VBO for vertex is not big enough, please break the data down or use customized render command");
            CCASSERT(cmd->getIndexCount()>= 0 && cmd->getIndexCount() < INDEX_VBO_SIZE, "VBO for index is not big enough, please break the data down or use customized render command");
            //Draw batched Triangles if VBO is full
            drawBatchedTriangles();
        }
        //Batch Triangles
        _batchedCommands.push_back(cmd);
        fillVerticesAndIndices(cmd);
        if(cmd->isSkipBatching())
        {
            drawBatchedTriangles();
        }
    }
    // ...
    else
    {
        CCLOGERROR("Unknown commands in renderQueue");
    }
}
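This method handles each kind of render command differently. There is a lot of code, so only the TrianglesCommand branch is kept. For a TrianglesCommand it first draws any batched commands of other kinds, such as QuadCommand. If the command is marked to skip batching, or adding it would overflow the vertex or index buffer, the triangles batched so far are drawn first. The command is then appended to the batch and its vertices and indices are copied into the buffers, to be drawn together later. A command that skips batching is drawn immediately after being added.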
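The flush conditions can be isolated in a small model. ToyTriCmd and ToyBatcher are hypothetical, the buffer sizes are passed in instead of using VBO_SIZE and INDEX_VBO_SIZE, and a "flush" only records how many commands were drawn together.

```cpp
#include <cstddef>
#include <vector>

struct ToyTriCmd {
    int vertexCount;
    int indexCount;
    bool skipBatching;
};

// Hypothetical model of the batching decision for triangle commands.
class ToyBatcher {
public:
    ToyBatcher(int vboSize, int indexVboSize)
        : _vboSize(vboSize), _indexVboSize(indexVboSize) {}

    void process(const ToyTriCmd& cmd) {
        // Draw what is batched so far if this command must stand alone
        // or would not fit into the remaining buffer space.
        if (cmd.skipBatching ||
            _filledVertex + cmd.vertexCount > _vboSize ||
            _filledIndex + cmd.indexCount > _indexVboSize) {
            flush();
        }
        // The command always joins the (possibly fresh) batch.
        _batched.push_back(cmd);
        _filledVertex += cmd.vertexCount;
        _filledIndex += cmd.indexCount;
        // A command that skips batching is drawn right away, on its own.
        if (cmd.skipBatching) flush();
    }

    // Stand-in for drawBatchedTriangles: records the batch size and resets.
    void flush() {
        if (_batched.empty()) return;
        _flushSizes.push_back(_batched.size());
        _batched.clear();
        _filledVertex = 0;
        _filledIndex = 0;
    }

    const std::vector<std::size_t>& flushSizes() const { return _flushSizes; }

private:
    int _vboSize;
    int _indexVboSize;
    int _filledVertex = 0;
    int _filledIndex = 0;
    std::vector<ToyTriCmd> _batched;
    std::vector<std::size_t> _flushSizes;
};
```

With room for 8 vertices and 12 indices, feeding three ordinary quads, one quad that skips batching, one more ordinary quad and a final flush produces batches of 2, 1, 1 and 1 commands.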
drawBatchedTriangles is the method that draws the TrianglesCommands batched so far:
void Renderer::drawBatchedTriangles()
{
    //TODO: we can improve the draw performance by insert material switching command before hand.
    int indexToDraw = 0;
    int startIndex = 0;
    //Upload buffer to VBO
    if(_filledVertex <= 0 || _filledIndex <= 0 || _batchedCommands.empty())
    {
        return;
    }
    if (Configuration::getInstance()->supportsShareableVAO())
    {
        //Bind VAO
        GL::bindVAO(_buffersVAO);
        //Set VBO data
        glBindBuffer(GL_ARRAY_BUFFER, _buffersVBO[0]);
        // option 1: subdata
//        glBufferSubData(GL_ARRAY_BUFFER, sizeof(_quads[0])*start, sizeof(_quads[0]) * n , &_quads[start] );
        // option 2: data
//        glBufferData(GL_ARRAY_BUFFER, sizeof(quads_[0]) * (n-start), &quads_[start], GL_DYNAMIC_DRAW);
        // option 3: orphaning + glMapBuffer
        glBufferData(GL_ARRAY_BUFFER, sizeof(_verts[0]) * _filledVertex, nullptr, GL_DYNAMIC_DRAW);
        void *buf = glMapBuffer(GL_ARRAY_BUFFER, GL_WRITE_ONLY);
        memcpy(buf, _verts, sizeof(_verts[0])* _filledVertex);
        glUnmapBuffer(GL_ARRAY_BUFFER);
        glBindBuffer(GL_ARRAY_BUFFER, 0);
        glBindBuffer(GL_ELEMENT_ARRAY_BUFFER, _buffersVBO[1]);
        glBufferData(GL_ELEMENT_ARRAY_BUFFER, sizeof(_indices[0]) * _filledIndex, _indices, GL_STATIC_DRAW);
    }
    else
    {
#define kQuadSize sizeof(_verts[0])
        glBindBuffer(GL_ARRAY_BUFFER, _buffersVBO[0]);
        glBufferData(GL_ARRAY_BUFFER, sizeof(_verts[0]) * _filledVertex , _verts, GL_DYNAMIC_DRAW);
        GL::enableVertexAttribs(GL::VERTEX_ATTRIB_FLAG_POS_COLOR_TEX);
        // vertices
        glVertexAttribPointer(GLProgram::VERTEX_ATTRIB_POSITION, 3, GL_FLOAT, GL_FALSE, kQuadSize, (GLvoid*) offsetof(V3F_C4B_T2F, vertices));
        // colors
        glVertexAttribPointer(GLProgram::VERTEX_ATTRIB_COLOR, 4, GL_UNSIGNED_BYTE, GL_TRUE, kQuadSize, (GLvoid*) offsetof(V3F_C4B_T2F, colors));
        // tex coords
        glVertexAttribPointer(GLProgram::VERTEX_ATTRIB_TEX_COORD, 2, GL_FLOAT, GL_FALSE, kQuadSize, (GLvoid*) offsetof(V3F_C4B_T2F, texCoords));
        glBindBuffer(GL_ELEMENT_ARRAY_BUFFER, _buffersVBO[1]);
        glBufferData(GL_ELEMENT_ARRAY_BUFFER, sizeof(_indices[0]) * _filledIndex, _indices, GL_STATIC_DRAW);
    }
    //Start drawing verties in batch
    for(const auto& cmd : _batchedCommands)
    {
        auto newMaterialID = cmd->getMaterialID();
        if(_lastMaterialID != newMaterialID || newMaterialID == MATERIAL_ID_DO_NOT_BATCH)
        {
            //Draw quads
            if(indexToDraw > 0)
            {
                glDrawElements(GL_TRIANGLES, (GLsizei) indexToDraw, GL_UNSIGNED_SHORT, (GLvoid*) (startIndex*sizeof(_indices[0])) );
                _drawnBatches++;
                _drawnVertices += indexToDraw;
                startIndex += indexToDraw;
                indexToDraw = 0;
            }
            //Use new material
            cmd->useMaterial();
            _lastMaterialID = newMaterialID;
        }
        indexToDraw += cmd->getIndexCount();
    }
    //Draw any remaining triangles
    if(indexToDraw > 0)
    {
        glDrawElements(GL_TRIANGLES, (GLsizei) indexToDraw, GL_UNSIGNED_SHORT, (GLvoid*) (startIndex*sizeof(_indices[0])) );
        _drawnBatches++;
        _drawnVertices += indexToDraw;
    }
    if (Configuration::getInstance()->supportsShareableVAO())
    {
        //Unbind VAO
        GL::bindVAO(0);
    }
    else
    {
        glBindBuffer(GL_ARRAY_BUFFER, 0);
        glBindBuffer(GL_ELEMENT_ARRAY_BUFFER, 0);
    }
    _batchedCommands.clear();
    _filledVertex = 0;
    _filledIndex = 0;
}
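It does three things:
- Uploads the vertex data (positions, colors, texture coordinates) and the index data to GPU buffers
- Walks the batched commands and issues one draw call for each run of commands that share a material
- Draws the last run that is still pending when the loop ends

The second step deserves a closer look. This is where cocos2d implements automatic batching, in the following code: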
    //Start drawing verties in batch
    for(const auto& cmd : _batchedCommands)
    {
        auto newMaterialID = cmd->getMaterialID();
        if(_lastMaterialID != newMaterialID || newMaterialID == MATERIAL_ID_DO_NOT_BATCH)
        {
            //Draw quads
            if(indexToDraw > 0)
            {
                glDrawElements(GL_TRIANGLES, (GLsizei) indexToDraw, GL_UNSIGNED_SHORT, (GLvoid*) (startIndex*sizeof(_indices[0])) );
                _drawnBatches++;
                _drawnVertices += indexToDraw;
                startIndex += indexToDraw;
                indexToDraw = 0;
            }
            //Use new material
            cmd->useMaterial();
            _lastMaterialID = newMaterialID;
        }
        indexToDraw += cmd->getIndexCount();
    }
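If a command has the same materialID as the previous one, nothing is drawn yet and its indices are simply added to the pending run. When a command with a different materialID shows up, the pending run is drawn in one call.
The idea is to gather commands that use the same texture, the same shader, the same blend function and other GL state so that they are drawn together. That reduces the number of glDrawElements calls and improves performance.
As for what the materialID is, TrianglesCommand generates it like this: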
void TrianglesCommand::generateMaterialID()
{
    if(_glProgramState->getUniformCount() > 0)
    {
        _materialID = Renderer::MATERIAL_ID_DO_NOT_BATCH;
    }
    else
    {
        int glProgram = (int)_glProgramState->getGLProgram()->getProgram();
        int intArray[4] = { glProgram, (int)_textureID, (int)_blendType.src, (int)_blendType.dst};
        _materialID = XXH32((const void*)intArray, sizeof(intArray), 0);
    }
}
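The code checks getUniformCount() > 0, that is, whether the element's GLProgramState has uniform values set on it, which is typically the case when a custom shader is used. In that case the materialID is set to Renderer::MATERIAL_ID_DO_NOT_BATCH and the element cannot take part in automatic batching.
Otherwise a hash is computed from the shader program, the texture and the blend function, and that hash is the materialID. So to decide whether two UI elements can be batched together it is enough to compare their materialIDs: if they are equal, their GL parameters match and both can be drawn in a single glDrawElements call.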
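The hashing idea can be tried out with the sketch below. It is not the cocos2d implementation: FNV-1a is swapped in for XXH32 so the example needs no external library, ToyMaterial is a hypothetical struct, and the value of kToyDoNotBatch is an assumption made for the example.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical sentinel meaning "never batch this command".
constexpr std::uint32_t kToyDoNotBatch = 0;

// The four values that define a material, plus the custom-uniform count.
struct ToyMaterial {
    std::uint32_t program;
    std::uint32_t textureId;
    std::uint32_t blendSrc;
    std::uint32_t blendDst;
    int customUniformCount;
};

// 32-bit FNV-1a, used here in place of XXH32.
std::uint32_t fnv1a32(const unsigned char* data, std::size_t len) {
    std::uint32_t h = 2166136261u;
    for (std::size_t i = 0; i < len; ++i) {
        h ^= data[i];
        h *= 16777619u;
    }
    return h;
}

std::uint32_t toyMaterialId(const ToyMaterial& m) {
    if (m.customUniformCount > 0) {
        return kToyDoNotBatch;  // per-element uniforms: must be drawn alone
    }
    const std::uint32_t key[4] = {m.program, m.textureId, m.blendSrc, m.blendDst};
    return fnv1a32(reinterpret_cast<const unsigned char*>(key), sizeof(key));
}
```

Two materials with the same program, texture and blend factors get the same ID, while changing only the texture changes the ID. Comparing one 32-bit integer per command is much cheaper than comparing all GL parameters during the batching loop.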
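Automatic batching can only merge adjacent RenderCommands, however, so keep this in mind when optimizing: elements that share a material should end up next to each other in the draw order.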
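The effect of adjacency is easy to measure with a small counting function that mirrors the batching loop. countDrawCalls is a hypothetical helper, and the sentinel value kNeverBatch is an assumption made for the example.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical sentinel for commands that must never be batched.
constexpr std::uint32_t kNeverBatch = 0;

// Counts the draw calls needed for a sequence of material IDs when only
// adjacent commands with the same ID can be merged.
int countDrawCalls(const std::vector<std::uint32_t>& materialIds) {
    int drawCalls = 0;
    bool havePending = false;  // is there a run waiting to be drawn?
    std::uint32_t last = 0;
    for (std::uint32_t id : materialIds) {
        if (!havePending || id != last || id == kNeverBatch) {
            if (havePending) ++drawCalls;  // draw the run that just ended
            last = id;
        }
        havePending = true;
    }
    if (havePending) ++drawCalls;  // draw the final run
    return drawCalls;
}
```

Four sprites using two textures in interleaved order (materials 1, 2, 1, 2) need 4 draw calls, while the same sprites grouped as 1, 1, 2, 2 need only 2. In practice, grouping can come from sharing a texture atlas or from arranging z-orders so that sprites with the same texture are neighbors.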
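Summary
That is the overall rendering flow of cocos2d:
On every iteration of the game loop, cocos2d does not draw immediately while traversing the UI tree. Instead, each UI element submits a render command (RenderCommand).
Once the UI tree has been traversed, the submitted commands form a render queue (RenderQueue). The Renderer first sorts the queue and then starts drawing.
During drawing, automatic batching is used as an optimization: adjacent render commands that use the same shader, texture and blend function are merged into one draw call to improve performance.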