Resources for Computer Vision Professionals

Foundations of Computer Vision: Understand the core principles of computer vision and gain insights into how these systems work.

Market Projections: Gain insight into the anticipated growth of the computer vision market, projected to reach US $20.88 billion by 2030, with impacts on key domains such as transportation, healthcare, security, entertainment, and agriculture.

Opportunities in Research and Development: Learn about the increasing demand for research and development in the expanding landscape of computer vision, and discover the rising job opportunities within this dynamic field.

Industry Impact and Challenges: Uncover the transformative effects of computer vision across various sectors, while acknowledging the existing limitations and barriers that require attention.

Ethical Considerations: Examine the ethical concerns of computer vision, including the pressing need for transparency, fairness, accountability, privacy, and the adoption of best practices to ensure responsible deployment.

What is Computer Vision?

“‘Intelligent’ computers require knowledge of their environment, and the most effective means of acquiring such knowledge is by seeing. Vision opens a new realm of computer applications,” Computer magazine, May 1973.

Grounded in the principles of artificial intelligence (AI), computer vision gives machines the ability to perceive and analyze visual data such as images, graphics, and videos. Its goal is the same as that of AI more broadly, to automate decisions, but it focuses exclusively on tasks that the human visual system would normally perform. IBM describes the contrast lucidly: “If AI enables computers to think, computer vision enables them to see, observe, and understand.”

The Fundamentals of Computer Vision

Although computer vision seems like a modern innovation, it is the outcome of extensive research stretching back to the 1960s. Beginning with Seymour Papert’s Summer Vision Project of 1966, the field has been in development for decades, improving steadily and creating new possibilities along the way. Though complex, the way these systems work can be broken down into four fundamental steps:

Visual data such as images or video is taken into the computer vision systems as input. Since images are made up of pixels, these machines process information at the pixel level.
To analyze the data, distinctive features in the image, such as contours, corners, or colors, are identified using algorithms and models.
Through the process of identification, the computer recognizes objects such as people, as well as certain behaviors in the visuals. With the help of machine learning, the computer can improve this ability over time.
Finally, the computer provides an output based on this interpretation. Put simply, this is when the computer communicates what it is seeing.
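
The four steps above can be sketched in a few lines of Python. This is a toy illustration rather than how a production system works: every function name is hypothetical, the "image" is a small hand-built grid containing one filled rectangle, and the identification step uses a hand-written rule where a real system would typically use a trained machine learning model.

```python
# Illustrative sketch only: every function name here is hypothetical.

def make_image(size=8, box=(2, 2, 5, 5)):
    """Step 1, input: a tiny grayscale image as rows of pixel intensities."""
    top, left, bottom, right = box
    return [
        [255 if top <= r <= bottom and left <= c <= right else 0
         for c in range(size)]
        for r in range(size)
    ]


def boundary_pixels(image, threshold=128):
    """Step 2, features: bright pixels that touch a dark pixel or the image border."""
    rows, cols = len(image), len(image[0])

    def bright(r, c):
        return 0 <= r < rows and 0 <= c < cols and image[r][c] >= threshold

    found = []
    for r in range(rows):
        for c in range(cols):
            neighbors = [(r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)]
            if bright(r, c) and not all(bright(nr, nc) for nr, nc in neighbors):
                found.append((r, c))
    return found


def recognize(boundary):
    """Step 3, identification: a hand-written rule standing in for a trained model."""
    if not boundary:
        return None
    top = min(r for r, _ in boundary)
    bottom = max(r for r, _ in boundary)
    left = min(c for _, c in boundary)
    right = max(c for _, c in boundary)
    height, width = bottom - top + 1, right - left + 1
    label = "square" if height == width else "rectangle"
    return label, (top, left, bottom, right)


def describe(result):
    """Step 4, output: report the interpretation in words."""
    if result is None:
        return "No object detected."
    label, (top, left, bottom, right) = result
    return f"Detected a {label} in rows {top}-{bottom}, columns {left}-{right}."


# Prints: Detected a square in rows 2-5, columns 2-5.
print(describe(recognize(boundary_pixels(make_image()))))
```

The rule in step 3 only makes sense for a single filled rectangle; it is there to show where recognition sits in the sequence, not to suggest a general method.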

Before computer vision reached its current range of applications, key pioneers led the way. For example, Ray Kurzweil of Kurzweil Computer Products, Inc. developed an optical character recognition (OCR) system in 1974. This system could recognize and process printed text in any font, without manual entry. When combined with machine learning and enhanced with text-to-speech features, the technology was used to read aloud for the blind.

This is just one pivotal example of the many applications that display the power and impact of computer vision. Thanks to waves of developments and crucial research, the technology has improved several domains of human life including transportation, healthcare, security, entertainment, and agriculture. Because of this, it is no surprise that the market of computer vision is expected to expand in the very near future.

Where Is Computer Vision Headed?

According to the Top Trends in Computer Vision Report, which reviews the latest trends covered at the 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), the computer vision industry generated more than US $12.14 billion in 2022 and is projected to grow about 7% annually, reaching US $20.88 billion by 2030.

Revenue is projected to increase because of surging demand for the technology in fields such as transportation, healthcare, and security. Moreover, according to PS Market Research, XR entertainment systems, which were worth US $38.3 billion in 2022, are predicted to reach US $394.8 billion by 2030.
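
As a quick sanity check, the annual growth rates implied by these figures can be worked out with the standard compound annual growth rate (CAGR) formula. The sketch below assumes an eight-year span from 2022 to 2030 and treats the quoted dollar amounts as exact; the function name is illustrative.

```python
def cagr(start_value, end_value, years):
    """Compound annual growth rate: (end / start) ** (1 / years) - 1."""
    if start_value <= 0 or years <= 0:
        raise ValueError("start_value and years must be positive")
    return (end_value / start_value) ** (1 / years) - 1


# Figures quoted in the text, in billions of US dollars (assumed span: 2022 to 2030).
YEARS = 2030 - 2022
computer_vision = cagr(12.14, 20.88, YEARS)
xr_entertainment = cagr(38.3, 394.8, YEARS)

print(f"Computer vision market: {computer_vision:.1%} per year")
print(f"XR entertainment systems: {xr_entertainment:.1%} per year")
```

The first result comes out at roughly 7% per year, which matches the growth rate cited for the computer vision market. The XR entertainment figures imply a much steeper rate of about 34% per year.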

Discover the Future of Computer Vision at IEEE/CVF Winter Conference on Applications of Computer Vision (WACV)

Transportation & Aviation

The U.S. National Highway Traffic Safety Administration (NHTSA) has reported that 94% of critical collisions are caused by human error. With the help of computer vision, advanced cameras and sensors allow vehicles to analyze their surroundings, detect objects such as pedestrians and other vehicles, and safely navigate around them. The technology is also used in the aviation sector to build flight simulators, where extended reality (XR) supports flight training while reducing costs, time, and potential damage to aircraft.

More Resources:

Learn more about computer vision and automated vehicles by taking the IEEE course on ‘Using Machine Vision Perception to Control Automated Vehicle Maneuvering’

Healthcare

Computer vision has also improved the patient experience within the healthcare system, including medical treatments and procedures. Specifically, it has transformed what can be done with medical imaging data, allowing practitioners to diagnose, monitor, or treat medical conditions. The technology also enables augmented reality (AR)-assisted surgical guidance, which can visualize human anatomy and aid practitioners when performing operations such as neurosurgical procedures.

Security & Privacy

Driven by progress in machine learning, edge computing, IoT, and AI, computer vision makes it possible to mitigate security threats in real time. For example, with the help of image processing and statistical pattern recognition, biometrics allow computers to recognize people based on physiological characteristics, such as faces or fingerprints. Computer vision also supports smart security surveillance, including cameras placed in different areas of a city that monitor for and detect threatening behavior. Privacy-preserving biometrics is attracting growing attention because it may help resolve concerns related to cryptographic authentication processes.
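
To make the biometric recognition step more concrete, here is a minimal sketch of one common matching approach: comparing a feature vector from a new sample against enrolled templates using cosine similarity and a threshold. Everything specific here is an assumption for illustration. The three-number vectors are made up (real systems derive much longer feature vectors from face or fingerprint images), and the names and the 0.95 threshold are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length feature vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("vectors must be non-zero")
    return dot / (norm_a * norm_b)


def identify(probe, enrolled, threshold=0.95):
    """Return the best-matching enrolled name, or None if nothing clears the threshold."""
    best_name, best_score = None, threshold
    for name, template in enrolled.items():
        score = cosine_similarity(probe, template)
        if score >= best_score:
            best_name, best_score = name, score
    return best_name


# Made-up templates standing in for features extracted from enrolled images.
enrolled = {
    "alice": [0.9, 0.1, 0.3],
    "bob": [0.2, 0.8, 0.5],
}

print(identify([0.88, 0.12, 0.31], enrolled))  # close to alice's template
print(identify([0.5, 0.5, 0.5], enrolled))     # not close enough to anyone
```

The threshold controls the trade-off between wrongly accepting a stranger and wrongly rejecting an enrolled person, which is one place where the data quality and bias concerns discussed later in this article come into play.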

Entertainment

Extended reality (XR) encompasses three categories: augmented reality (AR), mixed reality (MR), and virtual reality (VR). Each of these areas feeds the ever-growing demand for immersive experiences. Though mentioned previously for non-commercial uses such as flight training, XR is also expanding and transforming the entertainment industry. According to Built In, a few of the top companies include Oculus, Microsoft, and Samsung. XR gaming blurs the line between virtual and physical realities, simulating new worlds and adventures in which players can be fully immersed. According to XR Today, the technology has transformed social gatherings by giving users the ability to create virtual events and exhibitions anywhere at any time.

More Resources:

Learn More About Virtual Reality and its Applications at IEEE VR 2024

Agriculture

According to researchers, insects affect 35% of farmland. Understanding and monitoring the role insects play in agriculture is vital for food production, but it can be very labor-intensive and may even be unreliable at times. Computer vision can potentially improve this process by monitoring insects automatically. On top of that, computer vision can give automated machine systems ‘eyes’, enabling them to navigate fields without manual labor.

Career Opportunities

According to the US Bureau of Labor Statistics, employment of professionals in the computer and information science industry is expected to grow significantly over the decade, rising 21% by 2031. To fill these new roles, experts in computer vision, extended reality (XR), and data visualization will be needed.

Computer Vision Engineers

Computer vision engineers work in highly collaborative environments, usually guided by the needs of their clients. In addition to building architectures and using algorithms, their typical areas of expertise include image classification, face detection, pose estimation, and optical flow. Within this field, time is mainly spent developing models, retraining them, and creating reliable datasets.

Skills: Developing image analysis algorithms, deep learning architectures, image processing and visualization, computer vision libraries, and data flow programming
Salary: $160K USD (This is a salary estimation for United States employees according to talent.com. View estimates for other countries via Salary Expert.)
Degree: Bachelor’s in mathematics, computer vision, computer science, machine learning, information systems

XR Design/Graphics Engineers

Those within the XR industry, such as XR Design/Graphics Engineers, use their knowledge of computer vision to bring creative projects to life. They research and develop technology that augments reality, re-creates real-life environments, or generates other spaces that users can interact with virtually. Working cross-functionally with creative teams, they apply their computer vision knowledge to the design, optimization, integration, and testing of XR devices and products such as video games and other entertainment systems.

Skills: 3D visualization tools/art, programming languages such as Python, C/C++, and/or Java, linear algebra, multimedia software stacks and frameworks
Salary: $107,000 USD (This is a salary estimation for United States employees according to circuitstream.com. View estimates for other countries via Salary Expert.)
Degree: Bachelor’s in Computer engineering, mathematics, or related fields of study. Master’s in Human Centered Design and Engineering or Interaction Design
Networking Opportunities:
- IEEE Virtual Reality 2024
- Technical Community on Intelligent Informatics

Data Visualization Engineers

The power of visualizing data helps decision makers to recognize and address patterns and mistakes in their information, allowing them to make educated choices for their organization. Data visualization engineers create visual representations of data, then build dashboards for different business departments to inspect. They play a pivotal role in the process of informed decision-making.

Skills: Business intelligence (BI) tools, data analysis, Python-based visualizations, data visualization tools such as Tableau, Yellowfin, and Qlik Sense, and mathematics/statistics
Salary: $96,317 (This is a salary estimation for United States employees according to salary.com. View estimates for other countries via Salary Expert.)
Degree: Bachelor’s degree in computer science, computer information systems, software engineering, or a closely related field. Master’s degree in Data Analytics or Visualization
Networking Opportunities:
- IEEE VIS: Visualization & Visual Analytics
- Technical Community on Visualization and Graphics

Challenges and Limitations of Computer Vision Technology

While computer vision has made significant improvements, challenges still prevail, emphasizing the necessity for continuous research and development in the field. This includes concerns related to data quality and bias. It’s important to note that any technology created or managed by humans is susceptible to biases. To ensure accurate detections and optimal functionality, these systems must be developed with diversity in inputs.

Moreover, the question remains: Can a computer not only perceive but truly comprehend its observations? It is crucial to build trust in these systems by ensuring they interpret what they observe accurately and with minimal errors, so that they can be adopted more widely.

Lastly, security and privacy stand as major considerations for any widely adopted technology, and both remain challenging, with room for improvement. The issue is particularly pronounced in facial recognition, which calls for ongoing scrutiny and improvement.

Ethics, Standards, Diversity, and Inclusion

As the use of computer vision technology grows, ethical considerations have begun to dominate the discussion. It’s crucial to examine issues specific to computer vision rather than depending on the general ethics linked to AI. These conversations are taking place at conferences, in standards development and working groups, and in research projects.

Ethics in Computer Vision

The IEEE/CVF Computer Vision and Pattern Recognition Conference (CVPR) aims to encourage further discussion of ethics in computer vision applications and research. In 2022, researchers were encouraged to submit papers and proposals that discuss the potential negative societal impacts of their proposed research and possible ways to mitigate them. Potential ethical concerns include the safety of living beings, privacy, environmental impact, and economic security.

The organizers prioritized transparency and stated, “Grappling with ethics is a difficult problem for the field [computer vision], and thinking about ethics is still relatively new to many authors… In certain cases, it will not be possible to draw a bright line between ethical and unethical.”

The committee of IEEE/CVF CVPR 2023 planned to continue this conversation at the next annual conference and called for papers that focus on transparency, fairness, accountability, privacy, and ethics in vision.

Standards & Inclusion in XR

Specifically, in regard to ethics for XR, IEEE is laying down the foundation with standardization. As stated in IEEE Spectrum, “… the IEEE Standards Association (IEEE SA) is working to help define, develop, and deploy the technologies, applications, and governance practices needed to help turn metaverse concepts into practical realities, and to drive new markets.”

It’s also vital to keep in mind that this cutting-edge technology should be made accessible. For instance, it needs to accommodate people who are visually impaired. The study “Toward inclusivity: Virtual Reality Museums for the Visually Impaired” examines how narrations, spatialized “reference” audio, along with haptic feedback can be an effective replacement for the traditional use of vision in a virtual reality. The study discovered that those with visual impairments could locate objects more quickly with the aid of enhanced audio and tactile feedback.

Diversity in Visualization Research

Lastly, IEEE Transactions on Visualization and Computer Graphics (IEEE TVCG) conducted an analysis of gender representation among the attendees, organizers, and presenters at the IEEE Visualization (VIS) conference over the last 30 years. It was found that the proportion of female authors has increased from 9% in the first five years to 22% in the last five years of the conference.

The IEEE Computer Society urges academics and practitioners to send any ideas that may advance the dialogue to inclusion@computer.org, since efforts such as these have the potential to push the industry toward a brighter future.

Voices from the Community

IEEE Computer Society Fellow: Greg Welch

IEEE Computer Society Fellow Greg Welch, a computer scientist and engineer, is the AdventHealth Endowed Chair in Healthcare Simulation in UCF’s College of Nursing and co-director of the UCF Synthetic Reality Laboratory. In 2021, Welch reached Fellow status for contributions to tracking methods in augmented reality applications. His primary area of study is virtual reality (VR) and augmented reality (AR), collectively known as “XR,” with a focus on both hardware and software applications.

Currently, Welch spends his time researching how humans perceive AR-related experiences when interacting with the technology. He is also the lead of the pending NSF project “Virtual Experience Research Accelerator (VERA),” a system that will improve the process of generating VR-related research for scientists.

When asked what advice Welch had for readers with an interest in pursuing a similar path, he mentioned how beneficial ongoing exploration can be, “The field changes fast — something that is hot today might not be tomorrow. In addition, a broader perspective can enable one to see connections and opportunities.”

He recommends taking advantage of community resources and networking opportunities, “From an experiential perspective, get involved! The community [IEEE Computer Society] would not exist without volunteers, but there are so many benefits — it really is true that you get out what you put in.”

Insights and Trends from CVPR

Computer vision remains a dynamic and evolving field. Technological advances introduce new opportunities and efficiencies, and they are met with challenges in the form of new theoretical and societal considerations.

From privacy and algorithmic fairness to the feasibility of wide-scale adoption, this is one of the most exciting eras in computer vision. The market is expected to reach US $20.88 billion by 2030, growing 7% annually.

Environmental Factors Shaping Computer Vision

Increased industry demand – Industries ranging from finance and healthcare to retail and security and beyond are exploring how computer vision supports their emerging needs. Such emphasis means research continues to focus on ways to access and manipulate data in strategic, efficient, and highly accurate new ways.
Data accessibility – The quality and integrity of data remain pivotal to results. Computer vision researchers are exploring how to achieve highly accurate results with smaller data sets, as well as with new techniques. In addition, more emphasis has been placed on synthetic data as a way to expand use cases, improve availability, and address security issues around data sets.
Data privacy and bias – As computer vision techniques progress, how the data is used becomes a chief consideration. Advanced algorithms create unparalleled results, but personal security, bias, and societal factors come into play. Continued work will focus on the ethics surrounding these achievements.

Here are a few key observations, developments, and considerations for the field, informed by insights from IEEE Computer Vision and Pattern Recognition Conference (CVPR).

Blurred Lines between Computer Vision and Computer Graphics

“Half the papers in computer vision look like computer graphics. Instead of collecting data you can now simulate and that is very powerful.”

– Rama Chellappa, Johns Hopkins University

NeRF Research on the Rise

“NeRF research is a hot focus right now. It continues to generate jaw-dropping images and is a beautiful blend of computer graphics and computer vision. Computer vision scientists think of cameras as scientific measuring devices that can do more than capture visually pleasing 2D images. These algorithms are a continuation of that. The cameras will be designed to get better computational photography, unifying computer graphics, computational photography, and computer vision.”

– Kristin Dana, Rutgers University

Burgeoning Development of Content Generation

“Another trend is content generation: DALL-E can now generate images out of OpenAI. It makes some computational sense that we should be able to do it. When we think and have a text description, our brains generate an image even though we haven’t seen it, like when we read a book and generate an image in our heads. The algorithms are capturing that ability, and it’s remarkable. But with these content generation algorithms comes the potential for bias, and we have our work ahead of us in considering how they can and should be used.”

– Kristin Dana, Rutgers University

Re-Emergence of Classic Computer Vision

“The community is at a unique junction where while some papers focus on core technical research combining classical and modern deep networks, others focus on classical problems and innovative solutions.”

– Richa Singh, IIT Jodhpur

Synthetic Data

“There’s a tendency to move from real data to synthetic data if it is working, if it is effective. Cameras can only capture what has happened; whereas synthesis can imagine and produce whatever you wish. So, there is more variety in the synthetic data. And the privacy concerns are less.”

– Rama Chellappa, Johns Hopkins University
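
As a small illustration of the idea in this quote, labeled training images can be produced by a program instead of being captured by a camera. The sketch below is a deliberately simple stand-in: it draws one bright rectangle on a dark grid and records its position as the label. The function names, the 16-pixel image size, and the label format are all hypothetical, and synthetic data used in research is far richer (for example, scenes rendered with computer graphics).

```python
import random


def synthesize_sample(rng, size=16):
    """Draw one bright rectangle on a dark background and return (image, label)."""
    top = rng.randrange(0, size - 2)
    left = rng.randrange(0, size - 2)
    bottom = rng.randrange(top + 1, size)
    right = rng.randrange(left + 1, size)
    image = [
        [255 if top <= r <= bottom and left <= c <= right else 0
         for c in range(size)]
        for r in range(size)
    ]
    # The label is known exactly because the program placed the object itself.
    label = {"top": top, "left": left, "bottom": bottom, "right": right}
    return image, label


def synthesize_dataset(count, seed=0, size=16):
    """Generate a reproducible list of (image, label) pairs."""
    rng = random.Random(seed)
    return [synthesize_sample(rng, size) for _ in range(count)]


dataset = synthesize_dataset(count=5)
for image, label in dataset:
    bright = sum(pixel == 255 for row in image for pixel in row)
    print(label, "bright pixels:", bright)
```

Because the generator places each object itself, every label is exact and no real person or scene is recorded, which is the source of the variety and privacy advantages the quote describes.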

Dependable Facial Recognition Research

“The Computer Vision, Pattern Recognition, and Machine Learning community at large is focusing on developing ingenious algorithms not only for difficult scenarios, unconstrained environments, but also being trustworthy and dependable.”

– Richa Singh, IIT Jodhpur