Computer vision is a subfield of computer science that deals with analyzing and understanding images. This includes tasks such as detecting objects (for example, faces) in images and segmenting images.
Questions tagged [computer-vision]
634 questions
48
votes
3 answers
What does the notation mAP@[.5:.95] mean?
For detection, a common way to determine if one object proposal was right is Intersection over Union (IoU, IU). This takes the set $A$ of proposed object pixels and the set of true object pixels $B$ and calculates:
$$IoU(A, B) = \frac{A \cap B}{A…
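
As a rough illustration of the IoU idea in the excerpt above, the sketch below computes IoU on pixel sets as the size of the intersection divided by the size of the union. It also lists the thresholds that `[.5:.95]` denotes under the COCO convention (0.50 to 0.95 in steps of 0.05). The function name `iou` and the toy pixel sets are assumptions made for illustration; they do not come from the question.

```python
def iou(a: set, b: set) -> float:
    """Intersection over Union of two sets of pixel coordinates."""
    union = a | b
    if not union:
        # Both sets empty: define IoU as 0 to avoid dividing by zero.
        return 0.0
    return len(a & b) / len(union)


# Hypothetical toy data: pixels are (row, column) tuples.
proposed = {(0, 0), (0, 1), (1, 0), (1, 1)}
truth = {(0, 1), (1, 1), (0, 2), (1, 2)}
score = iou(proposed, truth)  # 2 shared pixels out of 6 distinct pixels

# IoU thresholds 0.50, 0.55, ..., 0.95 (COCO-style averaging range).
thresholds = [round(0.5 + 0.05 * i, 2) for i in range(10)]

# A proposal counts as a match at a threshold if its IoU reaches it.
hits = [score >= t for t in thresholds]
```

Under this convention, average precision is computed once per threshold and the results are averaged, so a detector is rewarded for tight localization and not only for loose overlap.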

Martin Thoma
10
votes
2 answers
How can I detect if an image was photoshopped?
I would like to check whether JPG files were manipulated to change their content.
What I consider NOT photoshopped:
- Cropping
- Rotating
- (Scaling)
- Image resolution
- Automatic changes smartphones might make
What I consider photoshopping:
- Adding a new…

Martin Thoma
4
votes
1 answer
What does 'energy' in image processing mean?
I have been going through the paper "Seam Carving for Content-Aware Image Resizing", which discusses resizing images by seam carving based on the image energy, also called the energy function.
Some related quotes from the paper are:
A seam is a…
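
A minimal sketch of one common energy definition, the gradient-magnitude energy $e_1(I) = |\partial I / \partial x| + |\partial I / \partial y|$ used in the seam carving paper. The unscaled finite-difference scheme, the clamped border handling, the function name `energy_map`, and the toy image are all assumptions made for illustration.

```python
def energy_map(img):
    """Per-pixel energy |dI/dx| + |dI/dy| for a 2D grayscale image.

    `img` is a list of rows of numbers. Derivatives are approximated by
    unscaled differences between neighbors, clamped at the borders.
    """
    h, w = len(img), len(img[0])
    energy = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            # Neighbor indices, clamped so border pixels reuse themselves.
            x0, x1 = max(x - 1, 0), min(x + 1, w - 1)
            y0, y1 = max(y - 1, 0), min(y + 1, h - 1)
            dx = abs(img[y][x1] - img[y][x0])
            dy = abs(img[y1][x] - img[y0][x])
            energy[y][x] = float(dx + dy)
    return energy


# Hypothetical toy image: a bright block on a dark background.
image = [
    [10, 10, 10, 10],
    [10, 10, 90, 90],
    [10, 10, 90, 90],
]
energy = energy_map(image)
```

Pixels in flat regions get low energy and pixels near the edge of the bright block get high energy, which is why low-energy seams tend to pass through visually unimportant areas.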

Dawny33
2
votes
1 answer
Effective Methods for Background Removal on Images
I'm interested in learning about how background removal works on images taken of clothing items. Do we need a specific color difference between the background and the clothing item in order to be able to determine object edges? How are edges…

jonnyd42
2
votes
1 answer
Human Height Estimation using walking stride
Are there any papers or research showing a correlation between walking stride and human height? My goal is to estimate a person's height from their walking stride.

Janki Desai
1
vote
1 answer
How can I save my learning rate on each finished epoch using Callbacks?
I used LearningRateScheduler for my model training. I want to save the learning rate at each epoch to a CSV file (or another document format).
Is there any way to save those learning rates using callbacks?

AIFahim
1
vote
0 answers
A good way to use facial landmark as model input
We are planning to use facial landmark information as input to the model. Since there are more than 60 points, it doesn't look good to use 60 channels as inputs after one-hot encoding. I found a few papers with similar ideas, but I didn't like…

Hyelin
1
vote
0 answers
Detectron2 Alternatives
Detectron2 is really good and supports a large number of models, but not all of them (e.g., YOLO).
Is there an alternative to Detectron2 that provides an easy-to-use API for both model inference and retraining, like Hugging Face Transformers does for NLP?
Base…

duck
1
vote
1 answer
Multi Object Detection
This might sound like a very basic question, but in a detection problem with multiple objects (solved by regression), how does the weight update work such that the regression equations of all classes are satisfied at once? How is it that boxes of other…

Priya Mehta
1
vote
0 answers
Blip2 for image metadata
I want to detect attributes of objects in an image, such as the color of a patch on a person's shirt, how many patches there are, the types of objects, the exact dimensions of the objects, etc.
I've heard of Blip2 but I'm not sure if this will do what I need…

Sand T
1
vote
2 answers
How to train the background removal (rembg) model on our own images
Does anyone know how to train the rembg model with our own images and save it to a pickle file?
This is the rembg model:
https://github.com/danielgatis/rembg
Please help me understand how to train the above model.

user128610
1
vote
0 answers
Separate handwritten text from typed text in images
I have scanned documents which contain both typed text in English and some handwritten text, including dates, signatures, or other text. Can someone point me to resources (preferably in Python) that detect or separate these two…

Sandeep Bhutani
0
votes
1 answer
Is there an existing best practice/tool for flattening images of documents taken from a camera?
I am researching a project which would include document scanning. What I want to achieve is something like the iOS document scanner in the Notes app.
Here the input is a single image from a smartphone camera. The app is able to automatically guess…

sak
0
votes
1 answer
OpenCV and computer vision
I'm new to computer vision, and I'm looking for a good place to start. Which is better: OpenCV in Python or OpenCV in C++?

ak3ra
0
votes
1 answer
Best framework for recognizing a specific cartoon character's face?
I have a supply of images of a specific cartoon character's face. I have hours of video. I would like to automatically find the sections of the video in which this cartoon character appears.
https://github.com/ageitgey/face_recognition doesn't seem…

Eli Rose