Unity Client

The first step is to model the real-world space in the virtual world, using Unity and Maya in tandem. Other 3D modeling packages, such as Blender and Rhinoceros, also work with Unity: any software that can save work as .fbx or .obj is compatible with the Unity engine. Objects are modeled individually, piece by piece, in Maya and then imported into Unity. Keep the polygon count low so that the models do not bog down the system by consuming too much memory. Textures and colors are added to the objects from within Unity, and the textures should likewise be light enough not to burden the overall system.

Once the virtual world has been created, an avatar to represent people must be selected or created and then imported. Avatars appear whenever people in the real world are within the bounds of the real-world cameras designated for the project. It is best to use an ambiguous figure with no distinct front, back, top, or bottom.

Finally, cameras must be placed in the Unity scene at the same location and height as the “mirror” tablets that will be installed in the real world. Each of these cameras should include a script that inverts the camera’s visual output, so that what is seen when Unity runs on a tablet is equivalent to looking into a mirror.

It is important to have an organized system for creating and adding new objects to the virtual world. For example, when modeling a room, each wall is its own object, as is each chair, table, door, and so on. Once they are added to the scene, however, they should all be grouped under a parent game object that represents the room as a whole. Keeping the modeled objects organized by room is important if changes need to be made later or if things are reorganized. It is also important to account for wall thickness when modeling. This keeps the overall model accurate, which makes it possible to establish a global coordinate system that is shared with the cameras responsible for tracking people in the real world.

Other views and modes of interacting with the world can then be implemented. Unity provides a first-person camera and corresponding controllers: a mobile camera that can be controlled with the mouse and the arrow keys. Unity also allows for multiplayer networking. For this project, this means that people on their computers at home can navigate the world from a first-person point of view, and their avatars will be seen in the “mirror” tablets mounted in the real-world space. A multiplayer networking package is available in the Unity Asset Store. It provides the foundation for the multiplayer functionality, but more work is needed to ensure that the networking code does not interfere with the SocketIO code. The next step is to install and set up SocketIO.
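
The mirror-inverting camera script mentioned above is not reproduced in this document. One common way to get the effect, shown here only as a minimal sketch, is to flip the camera's projection matrix horizontally and invert face culling while that camera renders. The class name MirrorCamera is hypothetical, and the sketch assumes Unity's built-in render pipeline, where OnPreCull, OnPreRender, and OnPostRender are called on scripts attached to a camera.

```csharp
using UnityEngine;

// Hypothetical sketch: attach to each "mirror" camera in the scene.
// Assumes the built-in render pipeline.
[RequireComponent(typeof(Camera))]
public class MirrorCamera : MonoBehaviour
{
    private Camera cam;

    void Awake()
    {
        cam = GetComponent<Camera>();
    }

    void OnPreCull()
    {
        // Start from the default projection, then flip the x axis.
        cam.ResetProjectionMatrix();
        cam.projectionMatrix = cam.projectionMatrix *
            Matrix4x4.Scale(new Vector3(-1f, 1f, 1f));
    }

    void OnPreRender()
    {
        // Flipping one axis reverses triangle winding, so culling
        // must be inverted or objects would render inside out.
        GL.invertCulling = true;
    }

    void OnPostRender()
    {
        // Restore normal culling for any other cameras.
        GL.invertCulling = false;
    }
}
```

Because the flip happens in the projection matrix, the camera's position and height in the scene stay exactly where the tablet is mounted.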

The SocketIO library is used by creating a game object with a SocketIOComponent script component attached to it. The script instance can then be obtained from other sections of Unity code by calling GameObject.Find() to locate that object and GetComponent() on the result. Once the instance is obtained, the socket’s callbacks should be specified. This is done with the function SocketIOComponent.On(string name, function f). The first argument specifies the name of the network callback to listen on, and the second argument specifies the function that should be called whenever a network signal on that callback is received. Each such function must accept a single argument, a SocketIOEvent object.

For this project, five callbacks were specified: connect, newBlob, updateBlob, removeBlob, and disconnect, each with its own function. The connect and disconnect callbacks are self-explanatory. The newBlob callback is used whenever the OpenCV blob detection program detects a new user. The updateBlob callback is called whenever an existing blob receives an update, which should happen many times per second under ideal circumstances. The removeBlob callback is used when tracking is lost on a blob. Each network callback function has access to a SocketIOEvent object, passed in by the SocketIO library as an argument. The SocketIOEvent concerns a single blob and contains a field, “data”, which holds a JSON object with relevant information about the event, such as camera id, blob id, and blob position.
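
The registration steps described above can be sketched as follows. This is an illustration rather than the project's actual code: the class name BlobNetwork, the handler names, and the scene object name "SocketIO" are assumptions, and the handlers only log the event payload because the exact JSON field names are not given here.

```csharp
using SocketIO;
using UnityEngine;

// Hypothetical sketch of wiring up the five callbacks.
public class BlobNetwork : MonoBehaviour
{
    private SocketIOComponent socket;

    void Start()
    {
        // Assumes the object carrying the SocketIOComponent
        // is named "SocketIO" in the scene.
        GameObject go = GameObject.Find("SocketIO");
        socket = go.GetComponent<SocketIOComponent>();

        // One handler per network callback.
        socket.On("connect", OnConnect);
        socket.On("newBlob", OnNewBlob);
        socket.On("updateBlob", OnUpdateBlob);
        socket.On("removeBlob", OnRemoveBlob);
        socket.On("disconnect", OnDisconnect);
    }

    // Every handler takes a single SocketIOEvent argument.
    void OnConnect(SocketIOEvent e) { Debug.Log("connected"); }
    void OnDisconnect(SocketIOEvent e) { Debug.Log("disconnected"); }

    // e.data holds the JSON payload for a single blob.
    void OnNewBlob(SocketIOEvent e) { Debug.Log("newBlob: " + e.data); }
    void OnUpdateBlob(SocketIOEvent e) { Debug.Log("updateBlob: " + e.data); }
    void OnRemoveBlob(SocketIOEvent e) { Debug.Log("removeBlob: " + e.data); }
}
```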

The information received from the server in the SocketIOEvent object is stored in a C# Dictionary object in order to keep track of the current model state. The key to the dictionary is a unique integer identifier for the blob, computed by combining the blob id and the camera id of the event. Since multiple cameras are in use, the blob id field alone is not sufficient, because different cameras can report the same blob id. This issue caused seemingly random, nondeterministic behaviour earlier in the semester, when only the blob id was being used. The key is computed by left-shifting the camera id and bitwise ORing that value with the blob id. The current shift amount is 22, which, with a 32-bit int key, allows for roughly 1,000 different camera ids and over 4,000,000 blobs for each camera. If either of these limits is reached, the key will not necessarily be unique; however, reaching either limit is unlikely in the near future for this project. A simple solution for future expansion would be to use a long for the key to allow more bits for both values.
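
The key computation can be sketched as below. The class and method names are hypothetical. The 22-bit shift is the one stated above, while the 32-bit split used in MakeLong is only one possible layout for the suggested long key.

```csharp
// Hypothetical helper for building dictionary keys.
public static class BlobKey
{
    // Low 22 bits hold the blob id; the remaining 10 bits
    // of the 32-bit int hold the camera id.
    public const int CameraShift = 22;

    public static int Make(int cameraId, int blobId)
    {
        return (cameraId << CameraShift) | blobId;
    }

    // Possible future variant: a 64-bit key with 32 bits
    // for each value (an assumed layout, not the current one).
    public static long MakeLong(int cameraId, int blobId)
    {
        return ((long)cameraId << 32) | (uint)blobId;
    }
}
```

For example, BlobKey.Make(1, 5) is 4194309 (that is, 4194304 + 5). The limit described above shows up once a blob id needs more than 22 bits: BlobKey.Make(0, 4194304) and BlobKey.Make(1, 0) both yield 4194304, whereas the long variant keeps them distinct.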

The values in the dictionary are instances of the Blob class, which store relevant blob information such as blob id, camera id, and the corresponding Unity GameObject that represents the actual blob that is rendered in 3D space. When a signal is given to create a blob, a new GameObject is created. It is an instantiated clone of the personMarker field, which can be quickly specified from within the Unity editor as any prefab object. When updates are received, the coordinates of the clone are altered accordingly, which automatically updates the respective visuals on the client displays.
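
A minimal sketch of this model state follows. Blob and personMarker are named in the text, but the field names, the BlobTracker class, and its method signatures are assumptions made for illustration. The network handlers would extract the camera id, blob id, and position from the event data before calling these methods.

```csharp
using System.Collections.Generic;
using UnityEngine;

// Per-blob record; field names are assumed.
public class Blob
{
    public int blobId;
    public int cameraId;
    public GameObject marker; // the rendered clone of personMarker
}

// Hypothetical component holding the current model state.
public class BlobTracker : MonoBehaviour
{
    // Any prefab, assigned from within the Unity editor.
    public GameObject personMarker;

    private readonly Dictionary<int, Blob> blobs =
        new Dictionary<int, Blob>();

    private static int MakeKey(int cameraId, int blobId)
    {
        return (cameraId << 22) | blobId;
    }

    public void CreateBlob(int cameraId, int blobId, Vector3 position)
    {
        int key = MakeKey(cameraId, blobId);
        if (blobs.ContainsKey(key))
        {
            // Already tracked: treat the signal as an update.
            UpdateBlob(cameraId, blobId, position);
            return;
        }
        GameObject clone = (GameObject)Instantiate(
            personMarker, position, Quaternion.identity);
        blobs[key] = new Blob
        {
            blobId = blobId, cameraId = cameraId, marker = clone
        };
    }

    public void UpdateBlob(int cameraId, int blobId, Vector3 position)
    {
        Blob blob;
        if (blobs.TryGetValue(MakeKey(cameraId, blobId), out blob))
        {
            // Moving the clone is enough; Unity redraws it.
            blob.marker.transform.position = position;
        }
    }

    public void RemoveBlob(int cameraId, int blobId)
    {
        int key = MakeKey(cameraId, blobId);
        Blob blob;
        if (blobs.TryGetValue(key, out blob))
        {
            Destroy(blob.marker);
            blobs.Remove(key);
        }
    }
}
```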

Computer Vision



Research Infrastructure