Getting started with Augmented Reality (iOS)

What is Augmented Reality?
To augment means “to increase the size or value of something by adding something to it.”
Augmented reality can therefore be understood as “an enhanced version of reality,” created by using technology to overlay digital information on an image of something viewed through a device (such as a smartphone camera).

With iOS 11, Apple released many exciting features for developers. The most interesting of them are ARKit, Core ML, and the Vision framework.

ARKit is Apple's native augmented reality framework for iOS apps. It combines camera input with motion sensing to understand the scene in front of the device, so that context-aware virtual objects can be placed on the display.

Similarly, Core ML is a framework that makes it easy for developers to integrate machine learning models into their apps. The Vision framework, also new in iOS 11, covers tasks ranging from text, barcode, face, and landmark detection to object tracking and image registration.

You can create some awesome apps using one or a combination of these frameworks. Some common scenarios are:

1. Face Detection – This can be achieved using the Vision framework, which detects faces and facial features such as the eyes, eyebrows, nose, and mouth. It works on still images, and on a live camera feed when combined with AVFoundation.
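
To make the still-image case concrete, here is a minimal sketch of how a face landmarks request might be run with Vision. The function names `detectFaceLandmarks(in:)` and `describe(_:)` are my own, chosen for illustration, and the code assumes you already have a `CGImage` to analyse.

```swift
import Vision
import CoreGraphics

/// Runs Vision face-landmark detection on a still image.
/// Returns one observation per detected face (empty if none are found).
func detectFaceLandmarks(in image: CGImage) throws -> [VNFaceObservation] {
    let request = VNDetectFaceLandmarksRequest()
    let handler = VNImageRequestHandler(cgImage: image, options: [:])
    // perform(_:) is synchronous, so call this off the main thread for large images.
    try handler.perform([request])
    return (request.results as? [VNFaceObservation]) ?? []
}

/// Hypothetical helper: prints the bounding box and a few landmark regions.
func describe(_ faces: [VNFaceObservation]) {
    for face in faces {
        // boundingBox is normalized (0...1) with the origin at the bottom-left.
        print("Face at \(face.boundingBox)")
        if let landmarks = face.landmarks {
            print("  left eye points: \(landmarks.leftEye?.pointCount ?? 0)")
            print("  nose points: \(landmarks.nose?.pointCount ?? 0)")
            print("  outer lips points: \(landmarks.outerLips?.pointCount ?? 0)")
        }
    }
}
```

For a live camera feed, the same request can be performed on each frame delivered by an AVFoundation capture session, using a request handler created from the frame's pixel buffer instead of a `CGImage`.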

While looking for AR-related examples, I came across an excellent example that uses the Vision framework for face landmark detection on a live camera feed in iOS 11. Here is the link:

Also, you can use the iPhone X’s TrueDepth camera with ARKit to place and animate 3D content that follows the user’s face and matches facial expressions. Here is a sample from Apple:
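
Independently of Apple's sample, the core of face tracking can be sketched in a few lines. This is only an outline under my own assumptions: the class name `FaceTrackingDelegate`, the `jawOpen` property, and the `startFaceTracking(on:delegate:)` function are illustrative names, and the code needs a device with a TrueDepth camera to do anything useful.

```swift
import ARKit
import SceneKit

/// Illustrative delegate: shows a wireframe mask that follows the user's face.
final class FaceTrackingDelegate: NSObject, ARSCNViewDelegate {
    /// Latest value of the "jaw open" expression coefficient (0 = closed, 1 = fully open).
    private(set) var jawOpen: Float = 0

    // ARKit asks for a node when it detects a new anchor (here: a face).
    func renderer(_ renderer: SCNSceneRenderer, nodeFor anchor: ARAnchor) -> SCNNode? {
        guard anchor is ARFaceAnchor,
              let device = renderer.device,
              let geometry = ARSCNFaceGeometry(device: device) else { return nil }
        geometry.firstMaterial?.fillMode = .lines   // draw the mesh as a wireframe
        return SCNNode(geometry: geometry)
    }

    // Called as the face moves or changes expression.
    func renderer(_ renderer: SCNSceneRenderer, didUpdate node: SCNNode, for anchor: ARAnchor) {
        guard let faceAnchor = anchor as? ARFaceAnchor,
              let geometry = node.geometry as? ARSCNFaceGeometry else { return }
        geometry.update(from: faceAnchor.geometry)  // deform the mask to match the face
        jawOpen = faceAnchor.blendShapes[.jawOpen]?.floatValue ?? 0
    }
}

/// Starts a face-tracking session if the device supports it.
func startFaceTracking(on view: ARSCNView, delegate: ARSCNViewDelegate) {
    guard ARFaceTrackingConfiguration.isSupported else { return }
    view.delegate = delegate
    view.session.run(ARFaceTrackingConfiguration())
}
```

The blend shape values are what let virtual content match facial expressions: each coefficient can drive the animation of a 3D character or overlay.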

Google also provides the Mobile Vision SDK, which can be used to achieve the same:

2. Placing virtual objects – ARKit can detect planes in the real world, and SceneKit can render virtual objects on them. All you need is a device with an A9 or later processor (iPhone 6s or later, iPhone SE, any iPad Pro, or the 2017 iPad), iOS 11, and Xcode 9 or later.
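
As a rough sketch of how plane detection might be wired up, the code below enables horizontal plane detection and draws a translucent rectangle over each plane found. The names `PlaneVisualizer` and `makePlaneDetectionConfiguration()` are my own, not part of ARKit.

```swift
import ARKit
import SceneKit
import UIKit

/// Illustrative delegate: overlays a translucent rectangle on each detected plane.
final class PlaneVisualizer: NSObject, ARSCNViewDelegate {
    func renderer(_ renderer: SCNSceneRenderer, didAdd node: SCNNode, for anchor: ARAnchor) {
        guard let planeAnchor = anchor as? ARPlaneAnchor else { return }

        // The anchor's extent gives the estimated width (x) and length (z) of the plane.
        let plane = SCNPlane(width: CGFloat(planeAnchor.extent.x),
                             height: CGFloat(planeAnchor.extent.z))
        plane.firstMaterial?.diffuse.contents = UIColor.white.withAlphaComponent(0.4)

        let planeNode = SCNNode(geometry: plane)
        planeNode.simdPosition = planeAnchor.center
        // SCNPlane is vertical by default, so rotate it to lie flat.
        planeNode.eulerAngles.x = -.pi / 2
        node.addChildNode(planeNode)
    }
}

/// Builds a world-tracking configuration with horizontal plane detection enabled.
func makePlaneDetectionConfiguration() -> ARWorldTrackingConfiguration {
    let configuration = ARWorldTrackingConfiguration()
    configuration.planeDetection = .horizontal
    return configuration
}
```

Given an `ARSCNView` called `sceneView` (assumed here), you would set `sceneView.delegate` to a `PlaneVisualizer` instance that you keep a strong reference to, then call `sceneView.session.run(makePlaneDetectionConfiguration())`.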

The supported formats for virtual objects that you can add to the plane include COLLADA (COLLAborative Design Activity), identified by the .dae extension (digital asset exchange), and Alembic, identified by the .abc extension.
After adding a .dae object to your Xcode project, you can convert it to the SceneKit file format (.scn). This is done in Xcode by choosing Editor → Convert to SceneKit scene file format (.scn).
Most of the time, however, the conversion is not necessary. According to the SCNSceneSource documentation, when you include a model in DAE or Alembic format in your project, Xcode automatically converts the file to SceneKit’s compressed scene format for use in the built app, and the file retains its original .dae or .abc extension.
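
Loading such a model at runtime might look like the sketch below. The function name `loadModelNode(named:)` is my own, and the path in the usage note is a made-up example.

```swift
import SceneKit

/// Loads a bundled model (.dae, .abc or .scn) and returns a single node
/// containing its contents, or nil if the file cannot be found or read.
func loadModelNode(named path: String) -> SCNNode? {
    guard let scene = SCNScene(named: path) else { return nil }

    // Move the model's top-level nodes under one container so the whole
    // object can be positioned, scaled and rotated as a unit.
    let container = SCNNode()
    for child in scene.rootNode.childNodes {
        container.addChildNode(child)
    }
    return container
}
```

For example, `loadModelNode(named: "art.scnassets/chair.dae")` (a hypothetical file) returns a node that can be added as a child of the node ARKit creates for a detected plane.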

There is good example code for tracking planes and placing furniture and decorative virtual objects anywhere in a scene. Here is the link:

3. Object Recognition – How do you create an app that recognises an object placed in front of the camera? This can be done by combining ARKit (with SceneKit), the Vision framework, and a Core ML model (MLModel) to name the object in view. I looked into multiple tutorials on implementing object recognition, and the one I liked most is:
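
Separately from the tutorial, here is a minimal sketch of the Vision and Core ML part of that pipeline. It assumes you already have an image classification model loaded as an `MLModel`; the function names `makeClassificationRequest(model:onResult:)` and `classify(_:with:)` are illustrative only.

```swift
import Vision
import CoreML
import CoreVideo

/// Wraps a Core ML image classifier in a Vision request.
/// `onResult` receives the top label and its confidence (0...1).
func makeClassificationRequest(
    model: MLModel,
    onResult: @escaping (String, Float) -> Void
) throws -> VNCoreMLRequest {
    let visionModel = try VNCoreMLModel(for: model)
    let request = VNCoreMLRequest(model: visionModel) { request, _ in
        // Classifier models produce classification observations, sorted by confidence.
        guard let top = (request.results as? [VNClassificationObservation])?.first else { return }
        onResult(top.identifier, top.confidence)
    }
    // Crop the frame to the model's expected aspect ratio around the center.
    request.imageCropAndScaleOption = .centerCrop
    return request
}

/// Runs the request on one camera frame.
func classify(_ pixelBuffer: CVPixelBuffer, with request: VNCoreMLRequest) throws {
    let handler = VNImageRequestHandler(cvPixelBuffer: pixelBuffer, options: [:])
    try handler.perform([request])
}
```

In an ARKit app, the pixel buffer can come from `session.currentFrame?.capturedImage`, and the returned label can then be shown as SceneKit text placed in the scene.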


With this basic understanding of augmented reality and links to different types of examples, think of some innovative ideas and start building interesting apps by combining all these awesome frameworks.


Written By: Neha Gupta