Tutorial 47: Picking

In most 3D applications the user will need to click on the screen with the mouse to select or interact with one of the 3D objects in the scene. This process is usually referred to as selection or picking. This tutorial will cover how to implement picking using DirectX 11.

The process of picking involves translating a 2D mouse coordinate position into a vector that is in world space. That vector is then used for intersection checks with all the visible 3D objects. Once the 3D object is determined the test can be further refined to determine exactly which polygon was selected on that 3D object.

For this tutorial we will use a single sphere and do a ray-sphere intersection test each frame using the current position of the mouse cursor.

Framework

The framework looks large for this tutorial, but all the new code is inside the ApplicationClass. Everything else in the framework is unchanged and is there to support rendering text, rendering a mouse cursor, and rendering a blue sphere.

Applicationclass.h

////////////////////////////////////////////////////////////////////////////////
// Filename: applicationclass.h
////////////////////////////////////////////////////////////////////////////////
#ifndef _APPLICATIONCLASS_H_
#define _APPLICATIONCLASS_H_


/////////////
// GLOBALS //
/////////////
const bool FULL_SCREEN = false;
const bool VSYNC_ENABLED = true;
const float SCREEN_NEAR = 0.3f;
const float SCREEN_DEPTH = 1000.0f;


///////////////////////
// MY CLASS INCLUDES //
///////////////////////
#include "d3dclass.h"
#include "inputclass.h"
#include "cameraclass.h"
#include "modelclass.h"
#include "lightclass.h"
#include "lightshaderclass.h"
#include "fontshaderclass.h"
#include "fontclass.h"
#include "textclass.h"
#include "bitmapclass.h"
#include "textureshaderclass.h"


////////////////////////////////////////////////////////////////////////////////
// Class name: ApplicationClass
////////////////////////////////////////////////////////////////////////////////
class ApplicationClass
{
public:
    ApplicationClass();
    ApplicationClass(const ApplicationClass&);
    ~ApplicationClass();

    bool Initialize(int, int, HWND);
    void Shutdown();
    bool Frame(InputClass*);

private:
    bool Render();

We have two new functions here. The first one is the general intersection check that forms the vector for checking the intersection and then calls the specific type of intersection check required. The second function is the ray-sphere intersection check function; this function is called by TestIntersection. For other intersection tests such as ray-triangle, ray-rectangle, and so forth you would add them here.

    bool TestIntersection(int, int);
    bool RaySphereIntersect(XMFLOAT3, XMFLOAT3, float);

private:
    D3DClass* m_Direct3D;
    CameraClass* m_Camera;
    ModelClass* m_Model;
    LightClass* m_Light;
    LightShaderClass* m_LightShader;
    FontShaderClass* m_FontShader;
    FontClass* m_Font;
    TextClass* m_TextString;
    BitmapClass* m_MouseBitmap;
    TextureShaderClass* m_TextureShader;
    int m_screenWidth, m_screenHeight;
};

#endif

Applicationclass.cpp

////////////////////////////////////////////////////////////////////////////////
// Filename: applicationclass.cpp
////////////////////////////////////////////////////////////////////////////////
#include "applicationclass.h"


ApplicationClass::ApplicationClass()
{
    m_Direct3D = 0;
    m_Camera = 0;
    m_Model = 0;
    m_Light = 0;
    m_LightShader = 0;
    m_FontShader = 0;
    m_Font = 0;
    m_TextString = 0;
    m_MouseBitmap = 0;
    m_TextureShader = 0;
}


ApplicationClass::ApplicationClass(const ApplicationClass& other)
{
}


ApplicationClass::~ApplicationClass()
{
}


bool ApplicationClass::Initialize(int screenWidth, int screenHeight, HWND hwnd)
{
    char modelFilename[128], textureFilename[128];
    char testString[32];
    bool result;


    // Store the screen width and height.
    m_screenWidth = screenWidth;
    m_screenHeight = screenHeight;

    // Create and initialize the Direct3D object.
    m_Direct3D = new D3DClass;

    result = m_Direct3D->Initialize(screenWidth, screenHeight, VSYNC_ENABLED, hwnd, FULL_SCREEN, SCREEN_DEPTH, SCREEN_NEAR);
    if(!result)
    {
        MessageBox(hwnd, L"Could not initialize Direct3D.", L"Error", MB_OK);
        return false;
    }

    // Create and initialize the camera object.
    m_Camera = new CameraClass;

    m_Camera->SetPosition(0.0f, 0.0f, -10.0f);
    m_Camera->Render();
    m_Camera->RenderBaseViewMatrix();

Load the blue sphere here.

    // Create and initialize the sphere model object.
    m_Model = new ModelClass;

    strcpy_s(modelFilename, "../Engine/data/sphere.txt");
    strcpy_s(textureFilename, "../Engine/data/blue.tga");

    result = m_Model->Initialize(m_Direct3D->GetDevice(), m_Direct3D->GetDeviceContext(), modelFilename, textureFilename);
    if(!result)
    {
        MessageBox(hwnd, L"Could not initialize the sphere model object.", L"Error", MB_OK);
        return false;
    }

Set up a basic light for the sphere.

    // Create and initialize the light object.
    m_Light = new LightClass;

    m_Light->SetDirection(0.0f, 0.0f, 1.0f);
    m_Light->SetDiffuseColor(1.0f, 1.0f, 1.0f, 1.0f);

Load the light shader for rendering the sphere.

    // Create and initialize the light shader object.
    m_LightShader = new LightShaderClass;

    result = m_LightShader->Initialize(m_Direct3D->GetDevice(), hwnd);
    if(!result)
    {
        MessageBox(hwnd, L"Could not initialize the light shader object.", L"Error", MB_OK);
        return false;
    }

Set up our font rendering objects so that we can render a text string indicating whether we have an intersection or not.

    // Create and initialize the font shader object.
    m_FontShader = new FontShaderClass;

    result = m_FontShader->Initialize(m_Direct3D->GetDevice(), hwnd);
    if(!result)
    {
        MessageBox(hwnd, L"Could not initialize the font shader object.", L"Error", MB_OK);
        return false;
    }

    // Create and initialize the font object.
    m_Font = new FontClass;

    result = m_Font->Initialize(m_Direct3D->GetDevice(), m_Direct3D->GetDeviceContext(), 0);
    if(!result)
    {
        return false;
    }

    // Create and initialize the text string object.
    m_TextString = new TextClass;

    strcpy_s(testString, "Intersection: No");

    result = m_TextString->Initialize(m_Direct3D->GetDevice(), m_Direct3D->GetDeviceContext(), screenWidth, screenHeight, 32, m_Font, testString, 10, 10, 0.0f, 1.0f, 0.0f);
    if(!result)
    {
        return false;
    }

Create a bitmap for the mouse and load the texture shader for rendering the mouse bitmap on the screen.

    // Create and initialize the mouse bitmap object.
    m_MouseBitmap = new BitmapClass;

    strcpy_s(textureFilename, "../Engine/data/mouse.tga");

    result = m_MouseBitmap->Initialize(m_Direct3D->GetDevice(), m_Direct3D->GetDeviceContext(), screenWidth, screenHeight, textureFilename, 50, 50);
    if(!result)
    {
        return false;
    }

    // Create and initialize the texture shader object.
    m_TextureShader = new TextureShaderClass;

    result = m_TextureShader->Initialize(m_Direct3D->GetDevice(), hwnd);
    if(!result)
    {
        MessageBox(hwnd, L"Could not initialize the texture shader object.", L"Error", MB_OK);
        return false;
    }

    return true;
}


void ApplicationClass::Shutdown()
{
    // Release the texture shader object.
    if(m_TextureShader)
    {
        m_TextureShader->Shutdown();
        delete m_TextureShader;
        m_TextureShader = 0;
    }

    // Release the mouse bitmap object.
    if(m_MouseBitmap)
    {
        m_MouseBitmap->Shutdown();
        delete m_MouseBitmap;
        m_MouseBitmap = 0;
    }

    // Release the text string object.
    if(m_TextString)
    {
        m_TextString->Shutdown();
        delete m_TextString;
        m_TextString = 0;
    }

    // Release the font object.
    if(m_Font)
    {
        m_Font->Shutdown();
        delete m_Font;
        m_Font = 0;
    }

    // Release the font shader object.
    if(m_FontShader)
    {
        m_FontShader->Shutdown();
        delete m_FontShader;
        m_FontShader = 0;
    }

    // Release the light shader object.
    if(m_LightShader)
    {
        m_LightShader->Shutdown();
        delete m_LightShader;
        m_LightShader = 0;
    }

    // Release the light object.
    if(m_Light)
    {
        delete m_Light;
        m_Light = 0;
    }

    // Release the sphere model object.
    if(m_Model)
    {
        m_Model->Shutdown();
        delete m_Model;
        m_Model = 0;
    }

    // Release the camera object.
    if(m_Camera)
    {
        delete m_Camera;
        m_Camera = 0;
    }

    // Release the Direct3D object.
    if(m_Direct3D)
    {
        m_Direct3D->Shutdown();
        delete m_Direct3D;
        m_Direct3D = 0;
    }

    return;
}


bool ApplicationClass::Frame(InputClass* Input)
{
    char testString[32];
    int mouseX, mouseY;
    bool result, intersect;
	

    // Check if the escape key has been pressed, if so quit.
    if(Input->IsEscapePressed() == true)
    {
        return false;
    }

Each frame we get the location of the mouse cursor. Once we have that we can first update where the mouse bitmap is being rendered on the screen. Then after that we use the TestIntersection function to see if the mouse is intersecting with the blue sphere or not. Once we know if we have an intersection or not, we can then update the text string with that information.

    // Get the location of the mouse from the input object.
    Input->GetMouseLocation(mouseX, mouseY);

    // Update the location of the mouse cursor on the screen.
    m_MouseBitmap->SetRenderLocation(mouseX, mouseY);
    
    // Check if the mouse intersects the sphere.
    intersect = TestIntersection(mouseX, mouseY);

    // If it intersects then update the text string message.
    if(intersect == true)
    {
        strcpy_s(testString, "Intersection: Yes");
    }
    else
    {
        strcpy_s(testString, "Intersection: No");
    }

    // Update the text string.
    result = m_TextString->UpdateText(m_Direct3D->GetDeviceContext(), m_Font, testString, 10, 10, 0.0f, 1.0f, 0.0f);
    if(!result)
    {
        return false;
    }

    // Render the final graphics scene.
    result = Render();
    if(!result)
    {
        return false;
    }

    return true;
}


bool ApplicationClass::Render()
{
    XMMATRIX worldMatrix, viewMatrix, projectionMatrix, baseViewMatrix, orthoMatrix, translateMatrix;
    bool result;


    // Clear the buffers to begin the scene.
    m_Direct3D->BeginScene(0.0f, 0.0f, 0.0f, 1.0f);

Get all the matrices we need for the different 3D and 2D rendering we will do each frame.

    // Get the matrices from the camera and d3d objects.
    m_Direct3D->GetWorldMatrix(worldMatrix);
    m_Camera->GetViewMatrix(viewMatrix);
    m_Direct3D->GetProjectionMatrix(projectionMatrix);
    m_Camera->GetBaseViewMatrix(baseViewMatrix);
    m_Direct3D->GetOrthoMatrix(orthoMatrix);

Render the sphere first in the upper left portion of the screen to make sure our picking takes into account the correct location.

    // Translate to the location of the sphere.
    translateMatrix = XMMatrixTranslation(-5.0f, 1.0f, 5.0f);

    // Render the sphere using the light shader.
    m_Model->Render(m_Direct3D->GetDeviceContext());

    result = m_LightShader->Render(m_Direct3D->GetDeviceContext(), m_Model->GetIndexCount(), translateMatrix, viewMatrix, projectionMatrix,
                                   m_Model->GetTexture(), m_Light->GetDirection(), m_Light->GetDiffuseColor());
    if(!result)
    {
        return false;
    }

Begin 2D rendering and render the text string and the mouse bitmap.

    // Disable the Z buffer and enable alpha blending for 2D rendering.
    m_Direct3D->TurnZBufferOff();
    m_Direct3D->EnableAlphaBlending();

    // Render the text string using the font shader.
    m_TextString->Render(m_Direct3D->GetDeviceContext());

    result = m_FontShader->Render(m_Direct3D->GetDeviceContext(), m_TextString->GetIndexCount(), worldMatrix, baseViewMatrix, orthoMatrix, 
                                  m_Font->GetTexture(), m_TextString->GetPixelColor());
    if(!result)
    {
        return false;
    }

    // Render the mouse cursor using the texture shader.
    result = m_MouseBitmap->Render(m_Direct3D->GetDeviceContext());
    if(!result)
    {
        return false;
    }

    result = m_TextureShader->Render(m_Direct3D->GetDeviceContext(), m_MouseBitmap->GetIndexCount(), worldMatrix, baseViewMatrix, orthoMatrix, m_MouseBitmap->GetTexture());
    if(!result)
    {
        return false;
    }

    // Enable the Z buffer and disable alpha blending now that 2D rendering is complete.
    m_Direct3D->TurnZBufferOn();
    m_Direct3D->DisableAlphaBlending();

    // Present the rendered scene to the screen.
    m_Direct3D->EndScene();

    return true;
}

The TestIntersection function is pretty much the entire focus of this tutorial. It takes as input the 2D mouse coordinates and then forms a vector in 3D space which it uses to then check for an intersection with the sphere. That vector is called the picking ray. The picking ray has an origin and a direction. With the 3D coordinate (origin) and 3D vector/normal (direction) we can create a line in 3D space and find out what it collides with.

In the other HLSL tutorials we are very used to a vertex shader that takes a 3D point (vertex) and moves it from 3D space onto the 2D screen so it can be rendered as a pixel. Now we are doing the exact opposite and moving a 2D point from the screen into 3D space, so we just need to reverse our usual process. Where we would usually take a 3D point from world to view to projection to make a 2D point, we will now instead take a 2D point and go from projection to view to world and turn it into a 3D point.

To do the reverse process we first start by taking the mouse coordinates and moving them into the -1 to +1 range on both axes. We then divide each coordinate by the matching scale term of the projection matrix (_11 for X and _22 for Y), which undoes the field of view and aspect ratio scaling and gives us the direction of the ray in view space. Next we multiply that direction by the inverse view matrix (inverse because we are going in the reverse direction) to get the direction vector in world space. The origin of the ray in world space is simply the position of the camera.
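
As a rough illustration of the first two steps, here is a minimal standalone sketch that does not depend on DirectXMath. The Vec3 type and the MouseToViewDirection name are hypothetical and exist only for this example; proj11 and proj22 stand in for the _11 and _22 elements of a left-handed perspective projection matrix.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical helper type for this sketch; the tutorial itself uses XMFLOAT3.
struct Vec3 { float x, y, z; };

// Convert a mouse position in pixels into a ray direction in view space.
// proj11 and proj22 are assumed to be the _11 and _22 elements of a
// left-handed perspective projection matrix.
Vec3 MouseToViewDirection(int mouseX, int mouseY, int screenWidth, int screenHeight,
                          float proj11, float proj22)
{
    assert(screenWidth > 0 && screenHeight > 0);
    assert(proj11 != 0.0f && proj22 != 0.0f);

    // Move the pixel coordinates into the -1 to +1 range.
    // Y is flipped because screen Y grows downward.
    float pointX = ((2.0f * (float)mouseX) / (float)screenWidth) - 1.0f;
    float pointY = 1.0f - ((2.0f * (float)mouseY) / (float)screenHeight);

    // Undo the projection scaling. The ray passes through z = 1 in view space.
    Vec3 direction = { pointX / proj11, pointY / proj22, 1.0f };
    return direction;
}
```

With this helper the center of the screen maps to a direction pointing straight down the view-space Z axis, and positions toward the edges tilt the ray by an amount that depends on the field of view and aspect ratio stored in the projection matrix.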

With the direction vector and origin in world space we can now complete the final step, which is moving the ray into the local space of the sphere. To do so we build the world matrix of the sphere by translating to the position of the sphere. We then invert that world matrix (since the process is going in the opposite direction) and multiply the origin and direction by the inverted world matrix. We also normalize the direction after the multiplication. This gives us the origin and direction of the ray relative to a sphere centered at the origin, which is the form the ray-sphere intersection test expects.
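
When the world matrix of the object is only a translation, as it is for the sphere in this tutorial, multiplying by the inverted world matrix reduces to subtracting the object position from the ray origin and leaving the direction alone. The following standalone sketch shows that special case; the Vec3 and Ray types and the WorldRayToLocal name are hypothetical and are not part of the tutorial framework.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical helper types for this sketch.
struct Vec3 { float x, y, z; };
struct Ray { Vec3 origin; Vec3 direction; };

// Move a ray into the local space of an object whose world matrix is a
// pure translation. Points are shifted by the negated translation, while
// directions are unaffected by a translation and only get normalized.
Ray WorldRayToLocal(const Ray& worldRay, const Vec3& objectPosition)
{
    Ray local;

    local.origin.x = worldRay.origin.x - objectPosition.x;
    local.origin.y = worldRay.origin.y - objectPosition.y;
    local.origin.z = worldRay.origin.z - objectPosition.z;

    const Vec3& d = worldRay.direction;
    float length = std::sqrt((d.x * d.x) + (d.y * d.y) + (d.z * d.z));
    assert(length > 0.0f);

    local.direction.x = d.x / length;
    local.direction.y = d.y / length;
    local.direction.z = d.z / length;

    return local;
}
```

If the object is also rotated or scaled this shortcut no longer holds, and the full inverse world matrix is needed as in the TestIntersection function.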

Now that we have the origin and the direction of the ray we can perform an intersection test. In this tutorial we perform a ray-sphere intersection test, but the same approach works for any kind of intersection test once the ray and the object are in the same space.

bool ApplicationClass::TestIntersection(int mouseX, int mouseY)
{
    XMMATRIX projectionMatrix, viewMatrix, inverseViewMatrix, worldMatrix, inverseWorldMatrix;
    XMFLOAT4X4 pMatrix, iViewMatrix;
    XMVECTOR direction, origin, rayOrigin, rayDirection;
    XMFLOAT3 cameraDirection, cameraOrigin, rayOri, rayDir;
    float pointX, pointY;
    bool intersect;


    // Move the mouse cursor coordinates into the -1 to +1 range.
    pointX = ((2.0f * (float)mouseX) / (float)m_screenWidth) - 1.0f;
    pointY = (((2.0f * (float)mouseY) / (float)m_screenHeight) - 1.0f) * -1.0f;
		
    // Adjust the points using the projection matrix to account for the aspect ratio of the viewport.
    m_Direct3D->GetProjectionMatrix(projectionMatrix);
    XMStoreFloat4x4(&pMatrix, projectionMatrix);
    pointX = pointX / pMatrix._11;
    pointY = pointY / pMatrix._22;

    // Get the inverse of the view matrix.
    m_Camera->GetViewMatrix(viewMatrix);
    inverseViewMatrix = XMMatrixInverse(NULL, viewMatrix);
    XMStoreFloat4x4(&iViewMatrix, inverseViewMatrix);

    // Calculate the direction of the picking ray in world space.
    cameraDirection.x = (pointX * iViewMatrix._11) + (pointY * iViewMatrix._21) + iViewMatrix._31;
    cameraDirection.y = (pointX * iViewMatrix._12) + (pointY * iViewMatrix._22) + iViewMatrix._32;
    cameraDirection.z = (pointX * iViewMatrix._13) + (pointY * iViewMatrix._23) + iViewMatrix._33;
    direction = XMLoadFloat3(&cameraDirection);

    // Get the origin of the picking ray which is the position of the camera.
    cameraOrigin = m_Camera->GetPosition();
    origin = XMLoadFloat3(&cameraOrigin);

    // Get the world matrix and translate to the location of the sphere.
    worldMatrix = XMMatrixTranslation(-5.0f, 1.0f, 5.0f);

    // Now get the inverse of the translated world matrix.
    inverseWorldMatrix = XMMatrixInverse(NULL, worldMatrix);

    // Now transform the ray origin and the ray direction from world space into the local space of the sphere.
    rayOrigin = XMVector3TransformCoord(origin, inverseWorldMatrix);
    rayDirection = XMVector3TransformNormal(direction, inverseWorldMatrix);

    // Normalize the ray direction.
    rayDirection = XMVector3Normalize(rayDirection);

    // Convert the ray origin and direction XMVECTOR to a XMFLOAT3 type.
    XMStoreFloat3(&rayOri, rayOrigin);
    XMStoreFloat3(&rayDir, rayDirection);

    // Now perform the ray-sphere intersection test.
    intersect = RaySphereIntersect(rayOri, rayDir, 1.0f);

    return intersect;
}

This function performs the math of a basic ray-sphere intersection test.
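
The a, b, and c coefficients used in the function can be derived as follows. This is the standard quadratic form of the test, written with the symbols O for the ray origin, D for the ray direction, and r for the radius (these symbol names are chosen here for illustration), and it assumes the sphere is centered at the origin.

```latex
% A point on the ray at parameter t:
P(t) = O + tD

% A point lies on a sphere of radius r centered at the origin when:
P \cdot P = r^2

% Substituting the ray into the sphere equation and expanding:
(D \cdot D)\,t^2 + 2\,(D \cdot O)\,t + (O \cdot O - r^2) = 0

% Which is a quadratic a t^2 + b t + c = 0 with:
a = D \cdot D, \qquad b = 2\,(D \cdot O), \qquad c = O \cdot O - r^2

% The line through the ray touches or crosses the sphere when:
b^2 - 4ac \ge 0
```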

bool ApplicationClass::RaySphereIntersect(XMFLOAT3 rayOrigin, XMFLOAT3 rayDirection, float radius)
{
    float a, b, c, discriminant;


    // Calculate the a, b, and c coefficients.
    a = (rayDirection.x * rayDirection.x) + (rayDirection.y * rayDirection.y) + (rayDirection.z * rayDirection.z);
    b = ((rayDirection.x * rayOrigin.x) + (rayDirection.y * rayOrigin.y) + (rayDirection.z * rayOrigin.z)) * 2.0f;
    c = ((rayOrigin.x * rayOrigin.x) + (rayOrigin.y * rayOrigin.y) + (rayOrigin.z * rayOrigin.z)) - (radius * radius);

    // Find the discriminant.
    discriminant = (b * b) - (4 * a * c);

    // If the discriminant is negative the picking ray missed the sphere, otherwise it intersected the sphere.
    if(discriminant < 0.0f)
    {
        return false;
    }

    return true;
}
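
Note that a discriminant check on its own treats the ray as an infinite line, so it also reports a hit when the sphere is directly behind the ray origin. One possible refinement, shown here as a standalone sketch rather than as part of the tutorial code, is to solve for the hit distance and reject hits behind the origin. The Vec3 type and the Dot and RaySphereNearestHit names are hypothetical.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical helper type for this sketch; the tutorial itself uses XMFLOAT3.
struct Vec3 { float x, y, z; };

static float Dot(const Vec3& a, const Vec3& b)
{
    return (a.x * b.x) + (a.y * b.y) + (a.z * b.z);
}

// Ray-sphere test for a sphere centered at the origin. Returns true and
// writes the nearest non-negative hit distance (in units of the direction
// length) to hitDistance. Returns false if the ray misses the sphere or
// the sphere is entirely behind the ray origin.
bool RaySphereNearestHit(const Vec3& rayOrigin, const Vec3& rayDirection, float radius, float& hitDistance)
{
    assert(radius > 0.0f);

    const float a = Dot(rayDirection, rayDirection);
    assert(a > 0.0f);
    const float b = 2.0f * Dot(rayDirection, rayOrigin);
    const float c = Dot(rayOrigin, rayOrigin) - (radius * radius);

    const float discriminant = (b * b) - (4.0f * a * c);
    if(discriminant < 0.0f)
    {
        return false;
    }

    // Solve the quadratic for the two distances along the ray.
    const float root = std::sqrt(discriminant);
    const float tNear = (-b - root) / (2.0f * a);
    const float tFar = (-b + root) / (2.0f * a);

    // Both hits are behind the ray origin.
    if(tFar < 0.0f)
    {
        return false;
    }

    // If the origin is inside the sphere only the far hit is in front.
    hitDistance = (tNear >= 0.0f) ? tNear : tFar;
    return true;
}
```

Returning the distance is also useful when several objects are tested, since the hit distances can be compared to find the object closest to the camera.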

Summary

We can now perform basic intersection tests with 3D objects in the scene using picking.

To Do Exercises

1. Compile and run the program. Use the mouse to move the cursor over the sphere or the empty space to test intersections. Press escape to quit.

2. Add a cube, triangle, and rectangle model to the scene. Also add intersection test functions for the three new types.

3. Now that you have "bounding box" tests, refine them further so that if the line intersects the sphere, a second check is done against all the triangles in the sphere and the selected triangle is highlighted.

4. Do the same as number three except for rectangles and cubes.

5. Place two objects in front of each other and make sure your intersection test returns only the one closest to the camera, ignoring the objects behind it when the test finds multiple intersections.

6. Encapsulate all of this picking functionality into a new IntersectorClass.
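
As a starting point for exercise 5, one possible approach is to compute the hit distance for every object and keep the smallest one. The sketch below is standalone and only handles spheres; the Vec3 and Sphere types and the PickNearestSphere name are hypothetical and are not part of the tutorial framework.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper types for this sketch.
struct Vec3 { float x, y, z; };
struct Sphere { Vec3 center; float radius; };

// Returns the index of the sphere whose surface the world-space ray hits
// first, or -1 if the ray hits nothing in front of its origin.
int PickNearestSphere(const Vec3& rayOrigin, const Vec3& rayDirection, const std::vector<Sphere>& spheres)
{
    const float a = (rayDirection.x * rayDirection.x) + (rayDirection.y * rayDirection.y) + (rayDirection.z * rayDirection.z);
    assert(a > 0.0f);

    int nearestIndex = -1;
    float nearestDistance = 0.0f;

    for(std::size_t i = 0; i < spheres.size(); i++)
    {
        // Move the ray origin into the local space of this sphere.
        const float ox = rayOrigin.x - spheres[i].center.x;
        const float oy = rayOrigin.y - spheres[i].center.y;
        const float oz = rayOrigin.z - spheres[i].center.z;

        const float b = 2.0f * ((rayDirection.x * ox) + (rayDirection.y * oy) + (rayDirection.z * oz));
        const float c = (ox * ox) + (oy * oy) + (oz * oz) - (spheres[i].radius * spheres[i].radius);

        const float discriminant = (b * b) - (4.0f * a * c);
        if(discriminant < 0.0f)
        {
            continue;
        }

        // Take the nearest hit that is in front of the ray origin.
        const float root = std::sqrt(discriminant);
        float t = (-b - root) / (2.0f * a);
        if(t < 0.0f)
        {
            t = (-b + root) / (2.0f * a);
        }
        if(t < 0.0f)
        {
            continue;
        }

        // Keep this sphere if it is the first hit or closer than the best so far.
        if((nearestIndex == -1) || (t < nearestDistance))
        {
            nearestIndex = (int)i;
            nearestDistance = t;
        }
    }

    return nearestIndex;
}
```

The same pattern extends to other shapes as long as each intersection test reports its distance along the same ray.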

Source Code

Source Code and Data Files: dx11win10tut47_src.zip
