China's mobile Internet industry has developed rapidly over the last decade, driving a quick expansion and upgrade of IT infrastructure across industries.
This article introduces the major technical difficulties and challenges that arise when video applications are upgraded across industries, proposes five key essentials and relevant reference standards for building the new generation of video cloud, and describes how an intelligent video cloud helps clients accelerate their upgrades with more convenient services at lower costs. In the future, video is likely to become a core part of how a company delivers its products or markets its services. Its scalability also closely matches the Matthew effect of the Internet. Companies therefore need to prepare in advance for massive volumes of rich media materials to avoid data disorder.
According to a survey by Markets&Markets, the global video surveillance market will grow at a compound annual growth rate of 15.4% between 2017 and 2022, reaching $75.6 billion in 2022. Video surveillance can be applied to a wide range of scenarios, including:
In the past two years, public institutions such as kindergartens and schools have placed greater demands on surveillance, for example:
In road traffic and urban safety surveillance, in addition to the traditional monitoring of vehicle violations, pedestrian violations are gradually being added to the surveillance system, such as:
In summary, the video surveillance industry is now facing an upgrade. The big challenges are how people can reliably access video surveillance over the public network, and how the massive volume of images and videos generated can be properly preserved, analyzed and retrieved.
Online education has become more popular than ever in recent years. Internet audio and video technologies remove the constraints of time and space on delivering high-quality educational resources. It takes the following forms:
Eliminating lag during live broadcasting and further reducing latency in video interactions, so as to improve the experience of teachers and students, has become a critical issue. With the development of artificial intelligence, more attention is being paid to how AI will take video technologies further in online education, such as:
To provide an attractive and interactive live experience for their audience, media platforms at all levels should not only work on content but also pay attention to presentation and modes of interaction. The traditional broadcasting scheme has certain limits:
Faced with these limits, the broadcasting industry urgently needs new video systems that give customers a video entertainment experience with high image quality, vivid interactions and measurable, precise data management.
New media broadcasting has several difficulties that need urgent solutions: how to switch broadcast-directing content in real time, how to ensure the real-time transmission of media content, how to maximize the advertising value of media, and how to produce high-quality programs at low cost.
Since 1 July 2016, all public hearings of the Supreme People's Court have been broadcast live online, and all live videos have been stored for the public to watch online. By March 2018, more than 660 thousand hearings across the country had been broadcast live, with around 5 billion visits. Intelligent courts make full use of advanced information technologies such as the Internet, big data, cloud computing and AI to support online handling of all services, lawful disclosure of the entire process, and all-round intelligent services:
Based on videos and documents, and combined with AI computer vision technology, electronic files can be read and analyzed to extract important elements and sort them by tag. For example, passages about criminal motive, time and tool can be tagged in different colors and then compared and cross-analyzed.
Intelligent courts place more demanding requirements on the reliability of video infrastructure, such as how to ensure the real-time transmission of live videos of public hearings, how to store massive live videos for VOD and replay, and how to conduct intelligent analysis of extensive video content.
Medical resources are still unevenly distributed across regions in China. Medical experts can conduct cross-regional interactive consultations through online live broadcasting and real-time audio and video:
Since its appearance, telehealth has been widely applied. However, it now faces major challenges, such as how to improve video transmission performance and how to ensure quick access for families, primary healthcare institutions, and outdoor emergencies.
During the upgrade, all of the above industries face enormous challenges in both technology and resources, and few companies are able to establish effective video services in a short time. Knowing how to choose and use the relevant video services on public clouds to quickly meet business upgrade goals is therefore critical.
To meet the needs and challenges of all industries in the era of video, intelligent video clouds should have the following five essentials:
Essential One: Stable Network Transmission and Dispatch: Smooth Watching Experience and Interactions with Low Delays
Essential Two: Extensible and Massive Storage Services: Data Security with High Reliability and Easy Extensibility
Essential Three: Editing and Processing of Media in Clouds: Fast and Multi-functional Video Editing in Clouds
Essential Four: Intelligent Analysis of Video Contents: Maximized Values behind Videos with AI
In the video structuration module, the basic elements and content of videos are collected and sorted, and linear videos are divided into components that can be used separately. Knowledge graphs organize the information produced by video structuration, such as events, figures, objects and scenarios, and store and present it for easy retrieval and association. Building on these two, the big data retrieval module offers highly efficient retrieval of massive media resources and content. Videos can be retrieved quickly by the characteristics of figures, faces, images and videos, or by more complex combinations of these.
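The retrieval step described above can be illustrated with a small sketch. It assumes that structuration has already produced tagged segments; the `Segment` and `SegmentIndex` names, the tag format and the sample data are hypothetical and are not part of Qiniu's product or API. A simple inverted index stands in for the knowledge graph and retrieval modules.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Segment:
    """A slice of a linear video, described by tags from structuration (hypothetical model)."""
    video_id: str
    start_s: float
    end_s: float
    tags: frozenset


class SegmentIndex:
    """Inverted index mapping each tag to the segments that carry it."""

    def __init__(self):
        self._by_tag = defaultdict(set)

    def add(self, segment):
        for tag in segment.tags:
            self._by_tag[tag].add(segment)

    def search(self, *tags):
        """Return segments carrying every requested tag, ordered by video and start time."""
        if not tags:
            return []
        # .get avoids creating empty entries for unknown tags.
        hits = set.intersection(*(self._by_tag.get(t, set()) for t in tags))
        return sorted(hits, key=lambda s: (s.video_id, s.start_s))


# Illustrative data only.
index = SegmentIndex()
index.add(Segment("hearing-01", 0.0, 42.5, frozenset({"person:judge", "scene:courtroom"})))
index.add(Segment("hearing-01", 42.5, 90.0, frozenset({"person:witness", "scene:courtroom"})))
index.add(Segment("lecture-07", 10.0, 55.0, frozenset({"person:teacher", "object:whiteboard"})))

results = index.search("scene:courtroom", "person:witness")
for seg in results:
    print(seg.video_id, seg.start_s, seg.end_s)
```

A production system would more likely rely on a dedicated search engine or graph store; the sketch only shows why tagged components make a linear video searchable by combined criteria.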
Essential Five: Comprehensive Permission Control: Prevention of Illegal Copying and Theft
Based on the above five essentials, Qiniu believes a complete intelligent video cloud should comprise the following modules:

Source: Qiniu
Qiniu's intelligent video cloud can not only technically satisfy the new needs of all industries in the era of video, but also save enterprises much of the cost of research, development and operation compared with building everything in-house.

Source: Qiniu
In the face of these high costs, video cloud services offer a rich range of products that are easy to use, flexible and inexpensive to maintain. Intelligent video clouds provide a universal technical system that can also be customized to the specific business, which greatly shortens the development cycle and reduces the cost of applications in all industries. Private or hybrid deployment of the video cloud's modules ensures the security of enterprise data while offering the same stability, reliability and flexibility as the public cloud.
AI, especially the in-depth application of computer vision technology, plays an enormous role in creating the technical and cost advantages of intelligent video clouds. In Qiniu’s intelligent video cloud system, computer vision technology replaces manual operation in many steps and greatly increases the efficiency of processing video content. Unlike traditional data analysis, Qiniu’s intelligent video cloud system turns previously unimaginable applications of data analysis into reality.
As the most essential techniques in the basic model layer of computer vision, facial recognition, object recognition and scenario recognition have been widely applied in fields such as security protection, broadcasting and education.
For example, in a security protection scenario, HD cameras with facial recognition and motion tracking can assess a person's behavior from the motion within the monitoring range and automatically alert the police if the behavior is suspicious. When the intelligent cameras are connected to the police's fugitive database, they can help the police recognize suspects in crowded places such as airports and railway stations, greatly increasing the efficiency of solving cases and arresting criminals.
Compared with manual tagging, video structuration based on computer vision has a series of obvious advantages, such as a wide recognition range, high accuracy, continuous iteration of learning models, the high efficiency of GPU machines, and low cost. Tagged videos can play a major role in industries like telehealth, online education and broadcasting.
For example, in telehealth, the number of videos and images is far beyond the capacity of manual tagging. To make the best use of medical videos, people need to sort the videos and images into categories. AI can sort videos accurately and efficiently, and people can then search for key information in videos in the same way as in text files, which turns medical big data into medical knowledge graphs.
Images and videos have replaced text as the new mainstream means of communication, so auditing images and videos is becoming more and more critical. However, manual auditing not only leads to high labor costs for enterprises, but its efficiency and accuracy also make it hard to keep up with the auditing demands of enormous volumes of video.
For example, in the broadcasting industry, manual auditing is widely used to screen videos for pornography, violence, terrorism and political figures. With computer vision technology, machines can replace humans in most content auditing situations, which greatly improves auditing efficiency and makes auditing at this scale far more manageable for the broadcasting industry.
Besides highly efficient video structuration and content auditing, computer vision technology can also serve as an engine of innovation for content operation and meet more individualized product demands.
For example, once video content has been structured, operators can intelligently recommend content according to users’ watching records, and can even target users with ads at a specific time and place in the video to maximize ad conversion. Intelligent recommendation of video content helps content operators conduct fine-grained, lean user management with high efficiency.
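As a minimal sketch of recommendation from watching records, the example below ranks unwatched videos by how often their tags appear in a user's history. The `recommend` function, the catalog and the tags are hypothetical and only illustrate the idea; they do not describe Qiniu's recommendation method.

```python
from collections import Counter


def recommend(watch_history, catalog, top_n=3):
    """Rank unwatched videos by how often their tags occur in the user's history."""
    # Tag-frequency profile built from the videos the user has already watched.
    profile = Counter(tag for vid in watch_history for tag in catalog[vid])
    scores = {
        vid: sum(profile[tag] for tag in tags)
        for vid, tags in catalog.items()
        if vid not in watch_history
    }
    # Highest score first; ties are broken by video id for a stable order.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    # Videos sharing no tag with the history are not recommended.
    return [vid for vid, score in ranked if score > 0][:top_n]


# Illustrative catalog: video id -> tags produced by structuration.
catalog = {
    "v1": {"football", "highlights"},
    "v2": {"football", "interview"},
    "v3": {"cooking", "tutorial"},
    "v4": {"football", "highlights", "analysis"},
    "v5": {"interview", "politics"},
}
suggestions = recommend(["v1", "v2"], catalog)
print(suggestions)  # ['v4', 'v5']
```

Real systems usually weight recency, watch duration and collaborative signals as well; tag overlap is chosen here only because it follows directly from structured video content.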
In the future, few enterprises will be able to exist in isolation from the Internet, and the total amount of enterprise data will keep growing. The value of data will increase along with the burden it brings. All enterprises need the flexibility to use and store files and rich media materials (including massive volumes of images, video and audio). However, few enterprises find it necessary to own the capability and resources to build video clouds themselves. What most enterprises need is a stable, upgradable video platform that lets them cope with ever-changing and escalating challenges in the future.
Source: Qiniu
