Glossary
This glossary defines a list of terms relevant to the PlaySDK. Some are industry terms, and others are specific to the PlaySDK design / development environment. Also see: Acronyms
In classic Virtual Reality parlance, an avatar is a term used to describe the symbolic representation of the Player within the virtual environment. In the PlayMotion Reference Environmental Specification, the avatar is elegantly represented by the Player's natural shadow. In rear projection scenarios, the avatar is represented by either a VirtualShadow or a synthesized 2d or 3d graphic representative of the Player's position, orientation, and/or gestural actions.
For a more extensive discussion of avatars and alternative representational techniques, please read Avatars and Mirrors, part of the Design Handbook.
Also, Avatar is a most awesome animated television series that aired on Nickelodeon from 2005-2008.
A blob is a term used in Computer Vision to designate a contiguous region of segmented pixels. Its most common use is to track humans or objects in a scene. This is a fairly straightforward task until blob paths intersect each other, which in the computer vision signal is interpreted as two blobs joining and then separating at some later time as the paths diverge again. When this is the case, one may naively predict, based on past motion tracks, where the enumerated blob will emerge on the other side of the intersection.
For a more technical definition, please see Blob Labeling / GetHumans.
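The naive prediction described above can be sketched in a few lines. This is only an illustration under a constant-velocity assumption; the function names `predict_position` and `match_emerging_blob` are hypothetical and are not part of the PlaySDK.

```python
def predict_position(track, steps_ahead=1):
    """Extrapolate a blob centroid from its last two tracked positions.

    `track` is a list of (x, y) centroids, oldest first. Constant velocity
    is an assumption made for this sketch.
    """
    (x0, y0), (x1, y1) = track[-2], track[-1]
    vx, vy = x1 - x0, y1 - y0  # per-frame velocity
    return x1 + vx * steps_ahead, y1 + vy * steps_ahead


def match_emerging_blob(prediction, candidates):
    """Return the index of the candidate centroid closest to the prediction."""
    px, py = prediction
    return min(
        range(len(candidates)),
        key=lambda i: (candidates[i][0] - px) ** 2 + (candidates[i][1] - py) ** 2,
    )


# Centroids of one blob over three frames, just before two blobs merge.
track = [(10, 50), (14, 50), (18, 51)]
predicted = predict_position(track, steps_ahead=3)

# Two blobs emerge after the intersection; pick the one nearest the prediction.
emerging = [(29, 55), (5, 52)]
best = match_emerging_blob(predicted, emerging)
```

Here `predicted` is `(30, 54)` and `best` is `0`, so the first emerging blob inherits the original label. The prediction fails whenever a player changes direction or speed while the blobs are merged, which is why the text calls it naive.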
The Calibrate Utility is an application that ships with the PlaySDK. It is the central command center for 1) adjusting the camera image and noise reduction 2) aligning the homography of the camera signal with the physical environment, or PlaySpace, and 3) configuring each of the individual computer vision services offered by the PlaySDK.
For more information and documentation, see the Calibrate Utility User Manual.
A camera is a specific class of sensor which is composed of a lens and an image sensor. A camera is to a machine what an eye is to a human. For the purposes of PlayMotion, we are dealing with cameras capable of delivering continual streams of live video data in real-time. In the PlayMotion Reference implementation, the PMSP camera is filtered so that it only sees activity in the infra-red portion of the light spectrum. Speed of camera optics and frame-rate in terms of data transmission are critical factors in the performance and enjoyment of your PlayMotion experience. PlayMotion has performed extensive research and benchmarking and offers for sale the PMSP, an optical unit optimized for high-performance vision-based experiences.
A key principle of the PlayMotion reference implementation is that "the camera is the dual of the projector," an idea put forth by noted computer vision scientist Ramesh Raskar in the late 1990s. In other words, both the camera and the projector are lensed optical devices; the projector generates a raster video image output stream, the camera senses a raster video image input stream. This understanding lies at the core of the PlayMotion Reference Design, and indeed almost all projector-camera systems.
A webcam is also a form of camera. If you intend to use a webcam with the PlaySDK, please read our Practical Considerations for Webcams page.
Computer Graphics (aka CG) is the process of translating a mathematical model of a scene (represented at high level by 3D models, textures, animations, shaders, and lighting rigs) into a live raster image to be output to a display. In this way, Computer Graphics is the inverse of Computer Vision.
Computer Vision (aka CV) is the process of taking a live raster image (represented as a matrix of numeric values) and interpreting it into higher level data abstractions and symbolic objects (such as humans, limbs, faces, props, poses, gestures, etc.). In this way, Computer Vision is the inverse of Computer Graphics (in a similar fashion, the camera is the dual of the projector).
Further, computer vision is the area of research which deals with computers abstracting high level data constructs from video-based data streams. By applying intelligence to the signal, the purpose is to get the machine to recognize objects, humans, poses, and gestures with a decent degree of accuracy. The broader field is called Machine Vision. Computer vision is a subset of Computer Science. Aligned fields include Robotics, Signal Processing, and Artificial Intelligence.
For the purposes of PlayMotion, computer vision uses a camera signal as input, and processes all algorithms in real time.
A connected component is a contiguous grouping of pixels. For instance, a flood fill algorithm would fill all the pixels of a connected component when fed the X,Y coordinates of any positive pixel which is a member of the component.
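The flood fill idea above can be sketched as follows. This is a minimal illustration, not the PlaySDK's implementation: the function name `flood_fill` is hypothetical, the image is a plain list of rows, and 4-connectivity (up, down, left, right) is an assumption.

```python
from collections import deque


def flood_fill(mask, x, y):
    """Return the set of (x, y) pixels in the connected component at (x, y).

    `mask` is a list of rows; any non-zero value counts as a positive pixel.
    Returns an empty set if the seed pixel is not positive.
    """
    height, width = len(mask), len(mask[0])
    if not mask[y][x]:
        return set()
    component = {(x, y)}
    queue = deque([(x, y)])
    while queue:
        cx, cy = queue.popleft()
        # Visit the four direct neighbours (4-connectivity, by assumption).
        for nx, ny in ((cx + 1, cy), (cx - 1, cy), (cx, cy + 1), (cx, cy - 1)):
            inside = 0 <= nx < width and 0 <= ny < height
            if inside and mask[ny][nx] and (nx, ny) not in component:
                component.add((nx, ny))
                queue.append((nx, ny))
    return component


# A tiny segmented image containing two separate components.
mask = [
    [0, 255, 255, 0, 0],
    [0, 255, 0, 0, 255],
    [0, 0, 0, 255, 255],
]
left = flood_fill(mask, 1, 0)
right = flood_fill(mask, 4, 1)
```

Seeding from any positive pixel of a component returns the same set, so `flood_fill(mask, 2, 0)` yields the same three pixels as `left`. Switching to 8-connectivity would only require adding the four diagonal neighbours.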
A display is a general class of devices which display graphical data in response to computer generated (CG) signals. Displays may work on raster, vector, or other custom models. Displays may be reflective (such as projectors), or emissive (such as LCD/plasma flat panels and LED walls). Generally, displays add visible light energy to the environment, and provide humans with data via the eyes.
For a much more in-depth discussion of display types and properties, see Choosing your Display Type.
Display Agnosticism refers to the fact that a PlayMotion system can be configured to use any class of display. A "standard" projector-camera system uses a projector and camera which share a common optical axis (or as near as is physically possible), as can be seen in the reference specification diagram. However, the projection surface can just as easily be replaced with a different class or orientation of display, including:
- Emissive (LED, Plasma, LCD)
- rear projection
In these cases, where the natural shadow of the Player is not visible, it is useful to either activate some form of Virtual Shadow, or place an animated avatar or similar user interface hinting on screen so that the player can quickly and intuitively grasp how their physical movement directly drives game interaction.
DoubleWide™ designates a display setup which affords a PlaySpace area which is twice that of a standard setup. In doing so, it enables Players a far greater freedom of lateral movement, including the ability to run freely across the PlaySpace... design note: running experiences can be significantly more active and different than standing experiences :). This is a really nice feature and generally re-defines what type of experiences one may develop for the system given such broad freedom of lateral movement.
The basic setup involves using the twin outputs offered on most modern graphics cards to drive two displays simultaneously, side by side. The physical dimensions are designated in Calibrate, and the graphics card is configured to split the output signal across two devices. A single camera can be used with a very wide lens covering the entire screen, or optionally developers can upgrade to PlaySDKpro, which includes the TwinCam feature.
The filter graph is at the core of PlayMotion Vision Engine services. It is an XML-configurable funnel of algorithms through which the live camera signal is processed in real-time for the purposes of computer vision. The filter graph is explicitly defined via XML within the siteXML file in your playmotion install directory.
Click here for more information on configuring the filter graph.
Front projection is the classic method of data projection, where the projector is at the back of the room aimed at the projection surface. In front projection scenarios, the Player intercepts the projection beam; this effectively makes their shadow the play avatar within the interactive experience. The converse of front projection is rear projection.
also see:
* display agnostic
* rear projection
* projector
* VirtualShadow
Gesture recognition is the area of research which deals with Computer Vision systems detecting a certain vocabulary as expressed through physical gestures and body language. This mode of interaction was best popularized by the interface which Tom Cruise uses in the movie Minority Report (2002). In HCI terms, specific gestures are linked to specific commands and/or actions which mediate between the physical world and the virtual world. Within the PlayMotion development environment, the Motion Recognition Service is the first foray into gesture recognition.
A GPU is a dedicated chip which is purpose-designed to optimize the performance of 3d graphics generation in real-time. In PlayMotion environments, many of the core Computer Vision routines have been ported to the GPU in order to optimize performance. A fast GPU is critical to developing low-latency, real-time virtual experiences. For recommendations on GPUs for your development and run-time machines, please see Minimum System Requirements.
nVidia has a most excellent book, GPU Gems: Programming Techniques, Tips, and Tricks for Real-Time Graphics, which is available in its entirety online here.
Human Computer Interaction, or HCI, is the knowledge domain of all relationships between humans and machines. HCI is a very broad area of study which includes Computer Vision within it. PlayMotion systems specifically target natural interactions, meaning those which rely on only the body and its inherent movement vocabulary in free space, as opposed to mechanical interfaces such as mice, keyboards, gloves and goggles.
The HumanScale design paradigm centers on the idea that all representations of virtual objects, at the point of contact with the player, should be the exact same size that they would be in real life, thus building a plausible bridge of physicality between the shadow, or avatar of the player, and the virtual object being contacted. Background objects can be given appropriate scaling, thus giving important depth cues to the player (additional cues can be made by attenuating sound and lighting).
The corollary of this is that the physical environment itself needs to take into account the free access, egress, interaction and full bodily movement of both adult and child players. Where full body interaction is desired, this generally means minimum PlaySpace dimensions of 9' high (as high as a 6' tall player can reach) by 12' wide (enough room for 2-3 adult players to wave their arms without colliding with each other). Setting the World Size within the Calibrate Utility allows your virtual objects to be scaled and referenced relative to this natural real world metric. The Humans Service is also generally configured to relate to a 1:1 scaled full body human segmentation image based on the configured World Size.
See also: Human Scale Design
Homography is a mapping of one geometric coordinate space to another. For practical purposes within a PlayMotion environment, homography refers to the mapping of a sub-area of the camera image to the intended display area, or PlaySpace, via the quadWarper filter. In addition to region mapping, a homography can also account for geometric warping due to a surface which is not entirely perpendicular to the optical axis of the projector-camera system. In this way, a relationship is established between each pixel of the camera input image (the real world) and one or more pixels of the final display output image (the virtual world).
Generally, your camera should be positioned so that it sees a larger area than is available to your projector or display device. Developers and designers can then use the Calibrate Utility to designate the specific sub-area of the camera image which is to be fed to the Vision Engine.
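To make the mapping concrete, here is a minimal sketch of applying a 3x3 homography matrix to a camera pixel. It illustrates the general technique only and says nothing about how the quadWarper filter is implemented; the function name `apply_homography`, the matrices, and the 1024x768 display size are assumptions chosen for the example.

```python
def apply_homography(H, x, y):
    """Map point (x, y) through the 3x3 homography H (a list of three rows)."""
    xh = H[0][0] * x + H[0][1] * y + H[0][2]
    yh = H[1][0] * x + H[1][1] * y + H[1][2]
    w = H[2][0] * x + H[2][1] * y + H[2][2]
    # Dividing by w turns homogeneous coordinates back into 2D coordinates.
    return xh / w, yh / w


# Region mapping only: the camera sub-area from (40, 30) to (296, 222)
# is stretched to fill a hypothetical 1024x768 display (scale 4, then shift).
region_map = [
    [4, 0, -160],
    [0, 4, -120],
    [0, 0, 1],
]
top_left = apply_homography(region_map, 40, 30)
bottom_right = apply_homography(region_map, 296, 222)

# A non-trivial bottom row adds perspective: points further down the image
# are divided by a larger w, as when the surface is tilted away from the
# optical axis.
tilted = [
    [1, 0, 0],
    [0, 1, 0],
    [0, 0.5, 1],
]
warped = apply_homography(tilted, 2, 2)
```

With these values `top_left` is `(0.0, 0.0)`, `bottom_right` is `(1024.0, 768.0)`, and `warped` is `(1.0, 1.0)`. In practice the matrix is solved from four corner correspondences, which is what designating the sub-area corners amounts to.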
Infra-red (aka IR) is a class of light which exists just above the Visible Spectrum, with a wavelength of 750 nanometers (nm) and longer. For the practical purposes of PlayMotion, we focus on wavelengths between 750 and 790nm. Infrared is not visible to the naked human eye; however, most digital camera sensors can detect it. When looking at an infrared light source, it is common to see a gentle red glow... this is due to the fact that the emitters (typically LEDs) put out a small amount of light at the bottom of the visible spectrum. For a common example of infra-red lights, look at the front of any television or cable remote control, and hold down a button. The small red flicker you see coming out of the lens is an indicator of IR spectral activity.
PlayMotion professional applications sense on the infrared spectrum only. This has the advantage of removing interference from the sensed image, especially when the sensed area is coincident with the PlaySpace (with a standard projector / camera setup).
Generally, we like to make the PlaySpace as large as possible and do our best to have it meet the ground. Although this works well for groups, sometimes you want to design a game with intimacy in mind. The MagicWindow concept is PlayMotion's term for sizing down the projection area so that it is only accessible to one or two people at a time. Additionally, a MagicWindow implementation emphasizes interaction of hands, whereas a standard PlaySpace puts more emphasis on full body interaction. In sizing down the projection area we seek to create an experience where the player feels as if they are peering through a window into a virtual world.
The MagicWindow concept works well with adults because, like TableTop, it emphasizes reaching in with the hand as opposed to committing the whole body (adults are often sensitive to body image issues, and reticent to see their shadows at full size, especially when in front of large public audiences).
A player's natural shadow, also referred to as an optical shadow, is the shadow of the player which occludes the displayed image. In classical PlayMotion front projection experiences, this shadow serves as the player's high-performance avatar. In many experience designs, this natural shadow occlusion can be of benefit; in other designs, such as 3D flying experiences, where players need to be able to see their intended destination, designers may opt for a rear projection setup using some form of virtual shadow or other avatar-like abstraction.
OpenCV (Open Source Computer Vision) is a library of programming functions mainly aimed at real-time computer vision application development. OpenCV is incorporated as part of the PlaySDK installation. Originally developed by Intel, it takes particular advantage of the Intel Performance Primitives (IPP) libraries.
The PlaySDK provides a friendly way to access many OpenCV functions using the Python programming language and XML. For more in-depth instruction on OpenCV, you may wish to purchase the official primer, Learning OpenCV, published by O'Reilly.
An excellent place to start is our OpenCV Resources page.
Panda3D is a free 3D game engine: a library of subroutines for 3D rendering and game development. It was jointly developed by Carnegie Mellon University and Walt Disney Imagineering. Panda3D (aka Panda) has bindings to Python. Many of the Sample Applications that ship with the PlaySDK utilize the Panda libraries for rendering graphical and audio output.
For more information, visit the official website at panda3D.org.
PERP is a hardware specification which establishes the optimal and certified configuration for high performance PlayMotion based application deployment.
For starters, please reference our Minimum System Requirements.
a person that plays; interactor; participant
In terms of PlayMotion Experiences, a player is any human who is present, aware, and interacting within the PlaySpace. Please note that interaction does not always imply movement. For instance, a detected Player can have the quality of stillness trigger a machine interaction (for example, see PlayMotion Trees: Seasons).
The PlayRemote is a handheld device capable of configuring all PlayMotion services and adjusting application variables during runtime. PlayRemotes and related driver software and configuration files are available from PlayMotion, Inc.
The PlaySDK is a set of documentation, software engines and sample code which enables experienced programmers, scripters, and smart designers to develop completely new and novel applications using computer vision, based on the PlayMotion Vision Engine (PMVE).
To get started creating, download the PlaySDK today!
A PlaySpace is what we call the physical area (or in 3space, volume) within which players interact with the PlayMotion Experience. In the PlayMotion Reference Environment, the PlaySpace is the area extending out from the projection surface.
For other environmental setups, such as TableTops, the PlaySpace would be defined as the active area atop the table surface.
In other words, the PlaySpace is defined as the physical world volume within which human movement is tracked as input for the application.
The PMCC is what we call the specialized computer with the PMVE and one or more PlayMotion-style experiences installed and running on it. A PMCC is a PlayMotion-certified version of the Optimal Hardware Requirements, and is embodied in our professional OEM hardware product, the PlayMotion CORE System.
The PlayMotion Vision Engine is the core engine which processes the incoming camera signal and parses it into intelligible data objects for use by your application. It includes the following services:
* camera input
* homography (geometry correction)
* noise reduction
* filter graph
* human detection
* hands and head detection
* edge normal
* moment service (principal axis detection)
* gesture detection (based on template banks full of input samples)
The PlayMotion Sensor Package is a unified optical assembly consisting of a high performance machine vision camera, tuned lens, integrated infrared filter, and mechanically aligned infra-red illuminators. Drivers for the PMSP and other high-end machine vision rigs come packaged with the PlayMotion SDK/pro. The PMSP is available for purchase from PlayMotion, Inc.
Poppet is the charming silhouette avatar which you can use to test out your interactive experiences on a small screen, perhaps the display of the very computer you're working on right now, without having to go through all the effort of getting up out of your chair to test in your actual PlaySpace. Poppet follows the movement of your mouse and raises his arms with a single mouse click. Brilliant!
Note: Poppet is activated when there is no camera attached to your system. Poppet is generated from a series of ordered bitmaps (0-9) located in the \product\gizmo\config\poppet folder within your install directory. For testing different types of interactions, you may wish to experiment with different Poppet sets. Five different Poppet-sets are supplied for testing; feel free to make your own as well, and share them with others on the forums.
For more details, visit the Poppet Reference.
The PlayMotion Reference Environmental Specification is the canonical PlayMotion installation, establishing a physical relationship between camera, projector, player, and PlaySpace.
For more details, visit the PRES documentation.
A projector-camera system is a very effective pairing of two optical devices which has enjoyed increased popularity in Human-Computer Interaction (HCI) development in recent years. The basic premise behind such systems is that a projector (display / output) and a camera (sensor / input) can be co-located, thus producing a seamless, large scale, potentially portable, and relatively inexpensive (vs., say, a CAVE) interaction environment. Advantages include:
- coverage of extremely large areas of interaction (multiplayer)
- positioning of hardware in safe locations out of reach of players
- relative inexpensiveness vs. other display media of comparable scale / coverage.
- portability with recent advances in projection and camera technology.
Ideally in such a system, the lenses of both devices would be optically matched. Since this is most often not the case, for the purposes of PlaySDK development, the Calibrate Utility offers real-time warp and geometry corrections to mathematically correct for both physical offset and different lens characteristics between the projector and camera.
One primary advantage of a projector camera system in entertainment experiences is the utilization of a player's natural shadow as avatar. The natural shadow is an extremely high resolution, zero-latency representation of the player's place within the virtual universe.
A projector is a specific type of display which uses a high-powered light source, a matrix image generator, and an optical path typically comprised of mirrors and lenses designed to project the image on a screen typically anywhere from several feet to many hundred feet away. A new class of projectors called "pocket projectors" now make it possible to incorporate this technology into mobile platforms such as cellphones, radically challenging existing design considerations. The PlayMotion Reference Specification relies on a projector that is fixed in space.
Python is a high-level language that balances accessibility and power. Technically, it is a dynamic object-oriented programming language that can be used for many kinds of software development. It offers strong support for integration with other languages and tools, comes with extensive standard libraries, and can be learned in a few days.
For more information, visit the official website at www.python.org.
For practical tips on working with Python, see our handy Tips for Fast Python Development.
Real-time refers to computer applications which perform all their processing on the fly, assimilating real world input (such as video data from a camera), and translating that into real world output (such as audio effects and video display) with no practically observable lag time between input and output cycles. For PlayMotion and everyday human purposes, a total pipeline lag of less than 1/30th of a second is considered acceptable real-time performance. For high-performance applications, we strive to optimize the pipeline for less than 1/100th of a second total lag. This requires end-to-end pipeline optimization, including machine-vision class cameras, fast CPUs and GPUs, and simulation class data projectors.
Please note: a camera input cycle (measured in Hz, or input frames per second) and/or a GPU fps-counter of image output do not in and of themselves represent total latency. Total latency is computed by summing all latencies together. Thus a 60fps camera and a GPU rendering at 60fps will together amount to a minimum of 1/30th of a second of lag (1/60 + 1/60), and this does not yet account for CPU delay.
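The latency arithmetic above can be checked with a short sketch. The stage names and the 5 ms CPU figure are assumptions for illustration only, not measured PlayMotion values.

```python
from fractions import Fraction


def total_latency(stage_latencies):
    """Sum per-stage latencies (in seconds) into one end-to-end figure."""
    return sum(stage_latencies.values(), Fraction(0))


# One frame period each for a 60fps camera and a GPU rendering at 60fps.
stages = {"camera": Fraction(1, 60), "gpu": Fraction(1, 60)}
camera_plus_gpu = total_latency(stages)

# Add a hypothetical 5 ms of CPU processing (an assumed figure).
stages_with_cpu = dict(stages, cpu=Fraction(5, 1000))
full_pipeline = total_latency(stages_with_cpu)

# Compare against the "less than 1/30th of a second" target.
within_budget = full_pipeline < Fraction(1, 30)
```

Here `camera_plus_gpu` is exactly 1/30 of a second, and `full_pipeline` comes to 23/600 of a second (about 38 ms), so `within_budget` is `False`. Two 60fps stages alone already use up the entire 1/30th budget, which is why faster cameras and displays matter for the high-performance target.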
Rear projection designates a class of display where the projector sits behind the projection screen, generally in a protected area. This effectively transforms the projection surface into a very large emissive display. With rear projection, there is no way for a Player's optical shadow to interfere with the projected image (NOTE: there are effective artistic implementations where dancers and other trained body arts professionals are granted exclusive access to the area between projector and screen, generating a type of "shadow puppet" scenario).
It is important to note that while front projection systems only require a white wall, rear projection systems require special screens or fabric which allows the projected light to be effectively transmitted through the surface to the viewers.
Rear projection is generally suboptimal (designers lose the elegance of a natural shadow avatar), but can be used to wonderful effect (and is actually preferred) with certain styles of interaction. For instance, with human-driven flight simulators, rear projection with a semi-transparent VirtualShadow is preferred, so that the Player can still see where they are going without having their natural shadow occlude the generated imagery.
Segmentation refers to the one-bit signal which differentiates the foreground pixels from the background pixels in a frame of video. Foreground pixels are designated as positive (or 255, white), and background pixels are designated as negative (0, or black). Segmentation is the cleanest signal possible after exposure, homography, noise reduction, and thresholding have been computed on the raw camera signal. These variables are all optimally tuned within the Calibrate Utility.
Once a segmented signal is acquired, higher level data abstractions can be modelled, including blob tracking, head / hands detection, etc.
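As a rough illustration of the final thresholding step, the sketch below turns a grayscale frame into a 0/255 segmentation image. It assumes that brighter pixels are foreground and uses an arbitrary threshold of 128; the function name `segment` is hypothetical, and the real pipeline also applies exposure, homography, and noise reduction first.

```python
def segment(frame, threshold):
    """Return a one-bit image: 255 where a pixel exceeds threshold, else 0.

    `frame` is a list of rows of grayscale values in the range 0-255.
    Treating brighter pixels as foreground is an assumption of this sketch.
    """
    return [[255 if pixel > threshold else 0 for pixel in row] for row in frame]


# A tiny 3x2 grayscale frame.
frame = [
    [12, 40, 200],
    [15, 180, 220],
]
segmented = segment(frame, threshold=128)
foreground_count = sum(pixel == 255 for row in segmented for pixel in row)
```

With these values `segmented` is `[[0, 0, 255], [0, 255, 255]]` and `foreground_count` is `3`. The result contains only the two values 0 and 255, which is what makes it a one-bit signal suitable for blob labeling.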
A sensor is a device which translates real-world activity into digital signals that are available for manipulation and interpretation by a machine and/or software. As the six senses transmit environmental stimuli to the human brain, so do sensors transmit environmental activity to machines. Examples of sensors include:
The term Sensor Array is used by PlayMotion to designate a combination of camera, filter, lens, and illumination sources. For instance, in the PMSP, the sensor array is comprised of:
a) a high speed machine vision camera (the sensor)
b) with infra-red filter
c) and a lens matched to the projector optics field of view
d) with a high power infra-red illuminator on either side of the camera lens.
In the most rudimentary instantiation, the sensor array is simply a store-bought webcam using available environmental lighting (overheads, lamps, etc.).
A sensor pixel is a single pixel of the sensed image. Some PlayMotion Vision services are based on unit dimensions and areas as measured in sensor pixels. There is a specific relationship between input sensor pixels (visionspace), worldspace, the physical PlaySpace, and output display resolution that should be kept in mind when designing your interactives; these are dependent upon your selection of camera, game engine, environment and display, respectively.
For technical clarity, read Vision Space vs. World Space: context.
The vision services sit at the heart of the PlayMotion Vision Engine. They are a series of user-configurable, high-performance algorithms designed to supply your application with a real-time stream of high level vision data. Each service has its own configuration and visualization module within the Calibrate Utility.
For in-depth information, read Configuring the Vision Services.
TwinCam is a feature available with PlaySDK/pro which allows 2 or more cameras to be attached to a single PMCC... the input signal from multiple cameras is then "stitched" in real time in order to synthesize a larger field of view. Used in combination with the DoubleWide feature, a single PlayMotion System can generate an experience at twice the width of the reference PlaySpace.
Generally, we recommend making the PlaySpace as large as possible, and having it meet the ground. With TableTop, we project downward onto a table surface. Like the MagicWindow, a TableTop experience emphasizes players' hands (manual dexterity) over their bodies (body awareness).
The visible spectrum is classically defined as the region of the electromagnetic spectrum which is visible to the human eye. While we generally see this as "white light", it is actually composed of more or less equal contributions across the spectrum, ROYGBIV (red orange yellow green blue indigo violet), as evidenced by the use of a prism and sunlight.
The PlayMotion Reference Specification uses a camera tuned to the infra-red wavelengths of light, which lie just beyond the red end of the visible spectrum, thus eliminating any visible interference caused by the display or projector in a projector-camera system.
The vision space (sometimes referred to informally as image space or segmentation space) is a 2D integer lattice space (only integer values allowed) with origin in the upper left-hand corner. Right is the positive x-direction and down is the positive y-direction. Typically the dimension of this space is in sensor pixels, and the extents match the resolution of the image generator's (e.g. camera's) output signal, for instance 320x240.
Also see: Functions to help convert between vision space and world space.
VirtualShadow™ is a feature developed specifically to accommodate running PlayMotion in rear projection environments. Instead of using the natural shadow, VirtualShadow synthesizes a shadow using the segmentation data interpreted from the sensor array. The end result is a rough approximation of a natural shadow. This gives the player a good frame of reference for their interaction.
The world space is a floating-point plane (decimals allowed) with origin in the center. Right is the positive x-direction, up is the positive y-direction. The dimension of the space is user specifiable (default 2x2).
Also see: Functions to help convert between vision space and world space.
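The relationship between the two spaces can be sketched as a pair of conversions. These are not the PlaySDK's own conversion functions: the names `vision_to_world` and `world_to_vision`, the 320x240 vision size, and the choice to map the pixel lattice edge-to-edge onto the default 2x2 world are all assumptions made for illustration.

```python
import math


def vision_to_world(vx, vy, vision_size=(320, 240), world_size=(2.0, 2.0)):
    """Convert vision-space pixel (vx, vy) to world-space (wx, wy)."""
    vw, vh = vision_size
    ww, wh = world_size
    wx = (vx / vw - 0.5) * ww  # origin moves from the left edge to the center
    wy = (0.5 - vy / vh) * wh  # y flips: down in vision space, up in world space
    return wx, wy


def world_to_vision(wx, wy, vision_size=(320, 240), world_size=(2.0, 2.0)):
    """Convert world-space (wx, wy) to the containing integer pixel."""
    vw, vh = vision_size
    ww, wh = world_size
    vx = math.floor((wx / ww + 0.5) * vw)
    vy = math.floor((0.5 - wy / wh) * vh)
    # Clamp to the lattice so points on the far edges stay inside the image.
    return min(max(vx, 0), vw - 1), min(max(vy, 0), vh - 1)


center = vision_to_world(160, 120)
corner = vision_to_world(0, 0)
pixel = world_to_vision(0.0, 0.0)
far_corner = world_to_vision(1.0, -1.0)
```

With the default sizes, `center` is `(0.0, 0.0)`, `corner` is `(-1.0, 1.0)`, `pixel` is `(160, 120)`, and `far_corner` clamps to `(319, 239)`. Going from world space to vision space loses precision because the vision space only allows integer coordinates.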
