There are many cases where you would like to be able to click on and select an entity in the 3D world. To do this we need to take the screen position of the mouse pointer and convert it into a position and a direction in the world. This gives us a ray that we can then test for collisions against the entities in our world. The ray travels from the user's eye through the screen (at the mouse position) and into the 3D world.

When we render an entity it is transformed from model space into world space via the world matrix set for each entity (see Matrices). It is then transformed into camera space using the view matrix before finally being rendered with perspective to the 2D screen using the projection matrix. When we click on the 2D screen with the mouse we need to do the reverse: take the 2D point and convert it into a position and ray in our 3D world.

The mouse position returned by the Windows API has co-ordinates ranging from the top left of the screen (0,0) to the bottom right (width, height). We first need to map this into camera space. To do so we rescale the mouse range to -1..1 on each axis and divide by the projection matrix scale values:

D3DXVECTOR3 v;

v.x = ( ( ( 2.0f * sx ) / w ) - 1 ) / matProj._11;

v.y = -( ( ( 2.0f * sy ) / h ) - 1 ) / matProj._22;

v.z = 1.0f;

Where sx and sy are the mouse screen positions. w is the width of the screen and h the height. matProj is the original projection matrix you set (if you have not stored matProj you can retrieve it by using the device->GetTransform(..) method).

*Note:* it is important that the width and height values above are correct. You need to use the size of the back buffer, which may not be the same as the window size (due to menu bars, borders etc.). What I normally do is call device->GetViewport( &m_mainViewport ) after creating the device and store the result. The viewport structure holds the correct back buffer width and height.
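To make the mapping above concrete, here is a minimal sketch of the same calculation as a standalone function. Vec3 is only a stand-in for D3DXVECTOR3 so that it compiles without the DirectX SDK, and the function name and the proj11 / proj22 parameters (standing for matProj._11 and matProj._22) are my own, not part of any API.

```cpp
#include <cassert>

// Stand-in for D3DXVECTOR3 (assumption: lets the sketch build without D3DX).
struct Vec3 { float x, y, z; };

// Converts a mouse position in pixels into a camera space direction.
// w and h are the back buffer width and height in pixels.
// proj11 and proj22 stand for matProj._11 and matProj._22.
Vec3 screenToViewDir(float sx, float sy, float w, float h,
                     float proj11, float proj22)
{
    assert(w > 0.0f && h > 0.0f);

    Vec3 v;
    // Rescale [0, w] to [-1, 1], then undo the projection's X scale.
    v.x = ((2.0f * sx) / w - 1.0f) / proj11;
    // Same for Y, negated because screen Y increases downwards.
    v.y = -((2.0f * sy) / h - 1.0f) / proj22;
    // The direction passes through the plane z = 1 in front of the camera.
    v.z = 1.0f;
    return v;
}
```

A click in the exact centre of the back buffer gives (0, 0, 1), i.e. a ray straight down the camera's view direction, which is a handy sanity check.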

Working backwards, the next transform to undo is the view matrix, so we apply the inverse of this matrix to create our ray:

D3DXMATRIX m;

D3DXVECTOR3 rayOrigin,rayDir;

D3DXMatrixInverse( &m, NULL, &matView );

// Transform the screen space pick ray into 3D space

rayDir.x = v.x*m._11 + v.y*m._21 + v.z*m._31;

rayDir.y = v.x*m._12 + v.y*m._22 + v.z*m._32;

rayDir.z = v.x*m._13 + v.y*m._23 + v.z*m._33;

rayOrigin.x = m._41;

rayOrigin.y = m._42;

rayOrigin.z = m._43;

We have now created a ray in world space from our mouse 2D position. It has a position in the world (rayOrigin) and a direction (rayDir). The direction is a vector defining the direction from the eye through the screen into the 3D world. rayOrigin is at the position of the camera.
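The same step can be sketched without D3DX to show what the matrix elements are doing. Vec3, Mat4 and Ray are stand-in types of my own (Mat4 is row-major, so m[r][c] plays the role of the D3DXMATRIX member with row r+1 and column c+1), and the function assumes it is handed the already inverted view matrix.

```cpp
#include <cassert>

// Stand-ins for the D3DX types (assumptions, not the real API).
struct Vec3 { float x, y, z; };
struct Mat4 { float m[4][4]; };   // row-major, m[0][0] is _11, m[3][0] is _41
struct Ray  { Vec3 origin, dir; };

// Builds a world space ray from a camera space direction v.
// invView is the inverse of the view matrix, i.e. the camera's world matrix.
Ray viewDirToWorldRay(const Vec3& v, const Mat4& invView)
{
    // Assumes an affine matrix, which is what a view matrix inverse should be.
    assert(invView.m[3][3] == 1.0f);

    Ray r;
    // Directions use only the upper 3x3 part: rotation, no translation.
    r.dir.x = v.x * invView.m[0][0] + v.y * invView.m[1][0] + v.z * invView.m[2][0];
    r.dir.y = v.x * invView.m[0][1] + v.y * invView.m[1][1] + v.z * invView.m[2][1];
    r.dir.z = v.x * invView.m[0][2] + v.y * invView.m[1][2] + v.z * invView.m[2][2];
    // The fourth row holds the camera position in world space.
    r.origin.x = invView.m[3][0];
    r.origin.y = invView.m[3][1];
    r.origin.z = invView.m[3][2];
    return r;
}
```

For a camera sitting at (0, 2, -5) with no rotation, the ray origin comes out as (0, 2, -5) and the direction is unchanged, which matches the statement that rayOrigin is the camera position.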

The next step is to collide this ray with the entities in our world. Normally you would do a bounding box and / or bounding sphere test first to trivially reject entities before doing an actual ray versus mesh intersection test.
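As an illustration of the trivial rejection step, here is one way a ray versus bounding sphere test might look. It is a sketch under my own assumptions: Vec3 is a stand-in for D3DXVECTOR3, the function name is made up, and the ray and sphere must be expressed in the same space.

```cpp
#include <cassert>

// Stand-in for D3DXVECTOR3 (assumption).
struct Vec3 { float x, y, z; };

static float dot(const Vec3& a, const Vec3& b)
{
    return a.x * b.x + a.y * b.y + a.z * b.z;
}

// Returns true if the ray origin + t * dir (t >= 0) touches the sphere.
// dir does not need to be normalised.
bool rayHitsSphere(const Vec3& origin, const Vec3& dir,
                   const Vec3& centre, float radius)
{
    assert(radius >= 0.0f);

    const Vec3 toCentre = { centre.x - origin.x,
                            centre.y - origin.y,
                            centre.z - origin.z };
    const float distSqToCentre = dot(toCentre, toCentre);
    const float radiusSq = radius * radius;

    // Ray starts inside the sphere: always a hit.
    if (distSqToCentre <= radiusSq) return true;

    const float dirLenSq = dot(dir, dir);
    if (dirLenSq == 0.0f) return false;          // degenerate ray

    // Ray parameter of the point on the ray closest to the centre.
    const float tClosest = dot(toCentre, dir) / dirLenSq;
    if (tClosest < 0.0f) return false;           // sphere is behind the ray

    // Squared distance from the centre to that closest point (Pythagoras).
    const float missSq = distSqToCentre - tClosest * tClosest * dirLenSq;
    return missSq <= radiusSq;
}
```

Only entities that pass this cheap test need to go on to the full mesh test.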

The D3DX utility library provides a useful function called D3DXIntersect that takes a mesh and a ray and determines if the ray has hit the mesh. In addition it can return the distance to the collision and other useful values.

The most important thing to remember is that you must do the intersect calculations in model space. So for each world entity we want to test against we must convert the ray into that entity's model space. To do this we take the inverse of the entity's world matrix and apply it to our ray:

// Use inverse of matrix

D3DXMATRIX matInverse;

D3DXMatrixInverse(&matInverse,NULL,&matWorld);

// Transform ray origin and direction by inv matrix

D3DXVECTOR3 rayObjOrigin,rayObjDirection;

D3DXVec3TransformCoord(&rayObjOrigin,&rayOrigin,&matInverse);

D3DXVec3TransformNormal(&rayObjDirection,&rayDir,&matInverse);

D3DXVec3Normalize(&rayObjDirection,&rayObjDirection);
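The reason the origin and the direction go through different functions is that a position must pick up the matrix translation while a direction must not. The sketch below mimics the two calls for an affine matrix using stand-in types of my own (Vec3, Mat4, and the function names are assumptions, not the D3DX API).

```cpp
#include <cassert>

// Stand-ins for the D3DX types (assumptions).
struct Vec3 { float x, y, z; };
struct Mat4 { float m[4][4]; };   // row-major, m[3][0..2] is the translation

// Position transform, in the spirit of D3DXVec3TransformCoord.
// Assumes an affine matrix, so w stays 1 and no divide is needed.
Vec3 transformPoint(const Vec3& p, const Mat4& a)
{
    assert(a.m[0][3] == 0.0f && a.m[1][3] == 0.0f &&
           a.m[2][3] == 0.0f && a.m[3][3] == 1.0f);
    return { p.x * a.m[0][0] + p.y * a.m[1][0] + p.z * a.m[2][0] + a.m[3][0],
             p.x * a.m[0][1] + p.y * a.m[1][1] + p.z * a.m[2][1] + a.m[3][1],
             p.x * a.m[0][2] + p.y * a.m[1][2] + p.z * a.m[2][2] + a.m[3][2] };
}

// Direction transform, in the spirit of D3DXVec3TransformNormal:
// the translation row is ignored.
Vec3 transformDirection(const Vec3& d, const Mat4& a)
{
    return { d.x * a.m[0][0] + d.y * a.m[1][0] + d.z * a.m[2][0],
             d.x * a.m[0][1] + d.y * a.m[1][1] + d.z * a.m[2][1],
             d.x * a.m[0][2] + d.y * a.m[1][2] + d.z * a.m[2][2] };
}
```

For an entity whose world matrix only translates it by (10, 0, 0), the inverse translates by (-10, 0, 0): a ray origin of (12, 0, -5) becomes (2, 0, -5) in model space while the direction is left alone.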

We can now call the intersect function on our untransformed graphic mesh data:

BOOL hasHit;

float distanceToCollision;

D3DXIntersect(m_mesh, &rayObjOrigin, &rayObjDirection, &hasHit, NULL, NULL, NULL, &distanceToCollision, NULL, NULL);

- hasHit is true if a collision has occurred.
- distanceToCollision gives the distance from the ray origin to the collision point, measured in model space since that is where the test was done.

The other parameters allow you to get more detailed values determining the actual collision point.

If you do not have a Direct3D mesh you can use D3DXIntersectTri, which tests the ray against a single triangle given its three vertex positions. You call it for each triangle in your own vertex data.
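A ray versus triangle test can also be written by hand if you want to see what is involved. The sketch below uses the well known Möller-Trumbore algorithm; that is my choice for illustration and not necessarily what D3DX does internally. Vec3 and the function name are stand-ins of my own.

```cpp
#include <cassert>
#include <cmath>

// Stand-in for D3DXVECTOR3 (assumption).
struct Vec3 { float x, y, z; };

static Vec3 sub(const Vec3& a, const Vec3& b)
{
    return { a.x - b.x, a.y - b.y, a.z - b.z };
}
static Vec3 cross(const Vec3& a, const Vec3& b)
{
    return { a.y * b.z - a.z * b.y, a.z * b.x - a.x * b.z, a.x * b.y - a.y * b.x };
}
static float dot(const Vec3& a, const Vec3& b)
{
    return a.x * b.x + a.y * b.y + a.z * b.z;
}

// Moller-Trumbore ray/triangle test. On a hit, writes the ray parameter t
// (the distance if dir is normalised) to *dist and returns true.
bool rayHitsTriangle(const Vec3& orig, const Vec3& dir,
                     const Vec3& p0, const Vec3& p1, const Vec3& p2,
                     float* dist)
{
    assert(dist != nullptr);

    const Vec3 e1 = sub(p1, p0);
    const Vec3 e2 = sub(p2, p0);
    const Vec3 pvec = cross(dir, e2);
    const float det = dot(e1, pvec);
    if (std::fabs(det) < 1e-7f) return false;    // ray parallel to triangle

    const float invDet = 1.0f / det;
    const Vec3 tvec = sub(orig, p0);
    const float u = dot(tvec, pvec) * invDet;    // first barycentric co-ordinate
    if (u < 0.0f || u > 1.0f) return false;

    const Vec3 qvec = cross(tvec, e1);
    const float v = dot(dir, qvec) * invDet;     // second barycentric co-ordinate
    if (v < 0.0f || u + v > 1.0f) return false;

    const float t = dot(e2, qvec) * invDet;
    if (t < 0.0f) return false;                  // triangle is behind the ray

    *dist = t;
    return true;
}
```

Looping this over every triangle and keeping the smallest distance gives the same kind of result as a mesh test, just without any acceleration.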

In your world model you loop through all the entities in your world, doing the above ray intersection test for each. If a collision occurs, remember the distance and keep looping. If you find another collision that is closer, then that is the one to use (the ray may pass through an entity with another behind it, but you really just want to pick the closest one).
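The loop described above might be sketched like this. Entity and its intersect hook are hypothetical: the hook stands for whatever per-entity code runs the model space ray test and reports the hit distance. The sketch also assumes the distances from different entities are comparable, which holds when the world matrices contain no scaling.

```cpp
#include <cassert>
#include <functional>
#include <limits>
#include <vector>

// Hypothetical entity record (assumption, not from any engine).
struct Entity {
    int id;
    // Runs the ray test for this entity; on a hit, writes the distance
    // from the ray origin and returns true.
    std::function<bool(float& distance)> intersect;
};

// Returns the id of the closest entity hit by the ray, or -1 if none is hit.
int pickClosest(const std::vector<Entity>& entities)
{
    int pickedId = -1;
    float closest = std::numeric_limits<float>::max();

    for (const Entity& e : entities) {
        assert(e.intersect != nullptr);
        float distance = 0.0f;
        // Keep looping after a hit: a later entity may be nearer.
        if (e.intersect(distance) && distance < closest) {
            closest = distance;
            pickedId = e.id;
        }
    }
    return pickedId;
}
```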

There are many ways to optimise picking. Firstly, there is no need to test against entities outside the viewing frustum. Secondly, you should test against an entity's AABB (axis-aligned bounding box) first and trivially reject those that do not collide. This is much quicker than a mesh test (you could also check against a bounding sphere or some other bounding volume). With very complex geometry you could even do your picking against a lower density mesh that is never rendered but used purely for collision purposes.
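For the AABB rejection test, one common approach is the "slab" method, sketched below. It is an illustration under my own assumptions rather than a prescribed implementation: Vec3 is a stand-in for D3DXVECTOR3, the function name is made up, and the ray and box must be in the same space (model space if the box is stored with the mesh).

```cpp
#include <algorithm>
#include <cassert>
#include <limits>
#include <utility>

// Stand-in for D3DXVECTOR3 (assumption).
struct Vec3 { float x, y, z; };

// Slab test: returns true if the ray origin + t * dir (t >= 0) enters the box.
bool rayHitsAabb(const Vec3& origin, const Vec3& dir,
                 const Vec3& boxMin, const Vec3& boxMax)
{
    assert(boxMin.x <= boxMax.x && boxMin.y <= boxMax.y && boxMin.z <= boxMax.z);

    const float o[3]  = { origin.x, origin.y, origin.z };
    const float d[3]  = { dir.x, dir.y, dir.z };
    const float lo[3] = { boxMin.x, boxMin.y, boxMin.z };
    const float hi[3] = { boxMax.x, boxMax.y, boxMax.z };

    float tNear = 0.0f;                               // ray starts at t = 0
    float tFar  = std::numeric_limits<float>::max();

    for (int i = 0; i < 3; ++i) {
        if (d[i] == 0.0f) {
            // Ray is parallel to this pair of planes: it must start between them.
            if (o[i] < lo[i] || o[i] > hi[i]) return false;
            continue;
        }
        float t1 = (lo[i] - o[i]) / d[i];
        float t2 = (hi[i] - o[i]) / d[i];
        if (t1 > t2) std::swap(t1, t2);
        // Narrow the interval of t for which the ray is inside every slab so far.
        tNear = std::max(tNear, t1);
        tFar  = std::min(tFar, t2);
        if (tNear > tFar) return false;               // interval is empty
    }
    return true;
}
```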

- Improved Ray Picking - an article by Robert Dunlop