Questions tagged [computer-vision]

For questions related to computer vision, an interdisciplinary scientific field that deals with how computers can be made to gain a high-level understanding of digital images or videos, and that can draw on, for example, image processing techniques. For instance, image recognition (that is, identifying the types of objects in an image) is a computer vision problem.

For more info, see e.g. https://en.wikipedia.org/wiki/Computer_vision.

537 questions
4
votes
2 answers

Kalman filter pre innovation

I am trying to track LIDAR objects using a Kalman filter. The problem is that the innovation has the value 0, which makes the Kalman gain infinite. Here is a link with the Kalman equations. The values with which I initialized the measurement and…
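
A minimal scalar sketch of the measurement update, assuming the standard Kalman equations (the linked equations are not reproduced in the excerpt, so this is an assumption). All names (`kalman_update`, `x_pred`, `p_pred`, `r`, `h`) are hypothetical. In this form the gain divides by the innovation covariance `s`, not by the innovation `y`, so a zero innovation by itself leaves the gain finite; a zero `s` (for instance when both `p_pred` and `r` are initialized to zero) is what would make the division fail.

```python
def kalman_update(x_pred, p_pred, z, r, h=1.0):
    """One scalar Kalman measurement update (illustrative sketch).

    x_pred, p_pred: predicted state and its variance
    z, r:           measurement and measurement-noise variance
    h:              measurement model (z = h * x + noise)
    """
    y = z - h * x_pred            # innovation (may legitimately be 0)
    s = h * p_pred * h + r        # innovation covariance
    if s <= 0.0:
        # This, not y == 0, is the case that would give an infinite gain.
        raise ValueError("innovation covariance must be positive")
    k = p_pred * h / s            # Kalman gain
    x_new = x_pred + k * y        # corrected state
    p_new = (1.0 - k * h) * p_pred  # corrected variance
    return x_new, p_new, k, y


# Measurement equals the prediction, so the innovation is exactly 0.
x_new, p_new, k, y = kalman_update(x_pred=2.0, p_pred=1.0, z=2.0, r=0.5)
```

With these example values the state stays where it was predicted and the gain remains finite, because `s = 1.5` is positive.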
3
votes
1 answer

Before GANs, what were the commonly used techniques for image-to-image translation?

As per a post, image-to-image translation is a type of CV problem. I guess I understand the concept of image-to-image translation. I am aware that GANs (generative adversarial networks) are good at this kind of problem. I just wondered what the…
WXJ96163
  • 185
  • 1
  • 7
2
votes
1 answer

How to quantify the reflectance in an image?

I am working on a problem where I have to train a CNN to recognize different kinds of surfaces. One important characteristic of the surfaces I am interested in is how reflective they are. I have been trying to find a method that quantifies how…
jdowner
  • 121
  • 3
2
votes
1 answer

Structured lighting basic principles for depth mapping

I've been wondering how, in its simplest-to-implement basic form, the light-projection-to-depth-map technique described here https://www.lightform.com/how-it-works actually functions. Is it some kind of an average based on the color of…
Rando Hinn
  • 123
  • 4
2
votes
2 answers

Is it possible to count the number of squats with Computer Vision techniques?

I am planning to build an app which will count the number of squats from videos. Assuming that the user and camera do not move, are there ways I can count the number of squats? Do models that understand human activity and pose exist?
vc_dim
  • 53
  • 6
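
One way the counting step could be sketched, assuming a pose estimator (not shown) already provides a per-frame hip height normalized so that standing is close to 1.0. The function name `count_reps`, the `low`/`high` thresholds, and the sample signal are all hypothetical. The sketch uses two thresholds (hysteresis) so that jitter around a single threshold is not counted as extra repetitions.

```python
def count_reps(hip_heights, low, high):
    """Count down-then-up cycles in a 1-D signal using hysteresis.

    hip_heights: per-frame hip height (larger means higher), assumed to
                 come from some pose-estimation step that is not shown.
    low:         going below this marks the bottom of a squat.
    high:        rising above this afterwards completes one repetition.
    """
    reps = 0
    is_down = False
    for height in hip_heights:
        if not is_down and height < low:
            is_down = True          # reached the bottom position
        elif is_down and height > high:
            is_down = False         # back to standing: one full rep
            reps += 1
    return reps


# Synthetic signal containing two squats.
signal = [1.0, 0.9, 0.6, 0.4, 0.6, 0.95, 1.0, 0.5, 0.35, 0.7, 1.0]
reps = count_reps(signal, low=0.5, high=0.9)
```

The thresholds would need tuning per user and camera setup; the fixed-camera assumption in the question is what makes a simple height signal plausible.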
2
votes
1 answer

Does it make sense to consider the base/backbone network one of the multiscale feature map blocks in SSD?

I'm trying to understand Single Shot Multibox Detection following a book adopted at 500 universities from 70 countries: "The complete single shot multibox detection model consists of five blocks. The feature maps produced by each block are used for…
singularli
  • 73
  • 5
2
votes
0 answers

How do cognitive services work?

Currently big tech companies like Microsoft, Google, and Amazon (to name a few) offer cognitive services on their cloud platforms. With these services it is possible to identify faces, objects, texts, sounds, etc. Do you know how these services work…
b00h
  • 59
  • 4
2
votes
0 answers

Ghost camera or video overlays, for example in sports

Secondary camera, ghost overlay, video merge... I do not know if what I mean has a more specific name. I wonder if this is a thing. This could be insightful for example in racing sports where participants race one after another e.g. alpine skiing,…
Martin
  • 121
  • 2
2
votes
0 answers

How do I generate structured light for the 3D bin picking system?

I want to know how to generate structured light that projects different light patterns onto the 3D object being scanned.
1
vote
1 answer

Can computer vision identify visual discrepancies in images?

I have an automated visual regression test harness for my web app. It takes baseline and change snapshots using puppeteer and compares them using pixelmatch. However, if I change the height of an element on the screen, it pushes everything else on…
Robert W
  • 111
  • 1
1
vote
1 answer

Detecting individual multiple documents in a pdf

I need to solve a problem where a scan of multiple documents (contracts, invoices, bank extracts) is stored in a PDF and I need to identify how many individual documents are contained in the PDF and which pages of the PDF belong to which…
1
vote
1 answer

How to compare images for similarity?

I'm working on a system that reads 3 images per second and stores them in a collection. For each image I have all its keypoints and descriptor vectors, computed with the ORB detector. On average there are 250 features in an image. To add a new image to the…
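
A rough sketch of one way two descriptor sets could be compared, under the assumption that each ORB descriptor is treated as a binary string and compared by Hamming distance. Descriptors are represented here as plain Python integers (toy 8-bit values rather than real 256-bit ORB descriptors), and `hamming`, `match_fraction`, and `max_distance` are hypothetical names chosen for illustration.

```python
def hamming(a, b):
    """Number of differing bits between two binary descriptors."""
    return bin(a ^ b).count("1")


def match_fraction(desc_a, desc_b, max_distance):
    """Fraction of descriptors in desc_a whose nearest neighbour in
    desc_b lies within max_distance bits (brute-force search)."""
    if not desc_a or not desc_b:
        return 0.0
    matched = 0
    for d in desc_a:
        best = min(hamming(d, other) for other in desc_b)
        if best <= max_distance:
            matched += 1
    return matched / len(desc_a)


# Toy 8-bit descriptors; only the first one in image_a has a close match.
image_a = [0b11110000, 0b00001111, 0b10101010]
image_b = [0b11110001, 0b01010101]
score = match_fraction(image_a, image_b, max_distance=2)
```

Brute-force matching is quadratic in the number of descriptors, so at roughly 250 features per image and a growing collection, an index over the descriptors would likely be needed in practice.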
1
vote
0 answers

Live video object detection with pose estimation

I was researching hierarchical object detection and ended up reading that YOLOv3 is the state of the art for that kind of task; besides, its inference time makes it one of the best for running on live video. So, what I have in mind is to run a…
Mario Vega
  • 11
  • 1
0
votes
0 answers

In the VGG16 SSD model, the feature maps from Conv5 and Conv7 are not directly used for prediction. Is my understanding correct?

The following SSD architecture comes from the original paper, with 2 arrows I added. The layer pointed out by the red arrow is labelled "Conv4_3", followed by Conv6, and then Conv7. Conv4, Conv7 and the layers after Conv7 are fed to the layer pointed…
JJJohn
  • 221
  • 2
  • 9
0
votes
2 answers

Where does 0.2023 for normalization on CIFAR-10 come from?

I'm studying a ResNet50 tutorial, which contains the following piece of code: def create_dataset_cifar10(dataset_dir, usage, resize, batch_size, workers): data_set = ds.Cifar10Dataset(dataset_dir=dataset_dir, …
JJJohn
  • 221
  • 2
  • 9