If you've ever unlocked your phone with your face, scanned a receipt, or stared at a self-checkout camera wondering whether it's judging your avocado, you've brushed up against computer vision. Put simply, Computer Vision in AI is how machines learn to see and understand images and video well enough to make decisions. Useful? Absolutely. Sometimes surprising? Also yes, and sometimes startlingly so, if we're honest. At its best, it turns messy pixels into practical actions. At its worst, it guesses and wobbles. Let's dig in properly.
What is Computer Vision in AI, really? 📸
Computer Vision in AI is the branch of artificial intelligence that teaches computers to interpret and reason about visual data. It is the pipeline from raw pixels to structured meaning: "this is a stop sign," "those are pedestrians," "the weld is defective," "the invoice total is here." It covers tasks such as classification, detection, segmentation, tracking, depth estimation, and OCR, increasingly stitched together by pattern-learning models. The formal field spans classical geometry through modern deep learning, with practical playbooks you can copy and tweak. [1]
Quick anecdote: picture a packaging line with a modest 720p camera. A lightweight detector spots caps, and a simple tracker confirms they stay aligned for five consecutive frames before green-lighting the bottle. Not fancy, but cheap, fast, and it reduces rework.
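The "five consecutive frames" rule from the anecdote can be sketched in a few lines. This is only an illustration: the function name `confirm_after_streak`, the `required` parameter, and the boolean per-frame results are assumptions, standing in for whatever a real detector and tracker would report.

```python
def confirm_after_streak(per_frame_ok, required=5):
    """Return the index of the frame where `required` consecutive positive
    checks have been seen, or None if that never happens."""
    streak = 0
    for index, ok in enumerate(per_frame_ok):
        # Any missed or misaligned frame resets the streak to zero.
        streak = streak + 1 if ok else 0
        if streak >= required:
            return index
    return None


# Hypothetical per-frame results: True means "cap detected and aligned".
frames = [True, True, False, True, True, True, True, True, True]
print(confirm_after_streak(frames))  # prints 7
```

Requiring a streak instead of a single hit trades a few frames of delay for fewer false green lights caused by one-frame flickers.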
What makes Computer Vision in AI useful? ✅
- Signal-to-action flow: Visual input becomes actionable output. Fewer dashboards, more decisions.
- Generalization: With the right data, one model handles a wide variety of images. Not perfectly, but sometimes shockingly well.
- Data leverage: Cameras are cheap and everywhere. Vision turns that ocean of pixels into insight.
- Speed: Models can process frames in real time on modest hardware, or in near real time, depending on the task and resolution.
- Composability: Chain simple steps into reliable systems: detection → tracking → quality control.
- Ecosystem: Tooling, pretrained models, benchmarks, and community support, one sprawling bazaar of code.
Let's be honest, the secret sauce isn't secret: good data, disciplined evaluation, careful deployment. The rest is practice... and maybe coffee. ☕
How Computer Vision in AI works, in one sane pipeline 🧪
- Image acquisition: Cameras, scanners, drones, phones. Choose sensor type, exposure, lens, and frame rate carefully. Garbage in, etc.
- Preprocessing: Resize, crop, normalize, deblur or denoise if needed. Sometimes a tiny contrast tweak moves mountains. [4]
- Labels and datasets: Bounding boxes, polygons, keypoints, text spans. Keep labels balanced and representative, or your model learns lopsided habits.
- Modeling:
  - Classification: "Which category?"
  - Detection: "Where are the objects?"
  - Segmentation: "Which pixels belong to which thing?"
  - Keypoints and pose: "Where are the joints or landmarks?"
  - OCR: "What text is in the image?"
  - Depth and 3D: "How far away is everything?"
  Architectures vary, but convolutional nets and transformer-style models dominate. [1]
- Training: Split the data, tune hyperparameters, regularize, augment. Stop early, before the model memorizes the wallpaper.
- Evaluation: Use task-appropriate metrics such as mAP, IoU, F1, and CER/WER for OCR. Don't cherry-pick. Compare fairly. [3]
- Deployment: Optimize for the target: cloud batch jobs, on-device inference, edge servers. Monitor drift. Retrain when the world changes.
Deep nets delivered a leap in quality once large datasets and compute reached critical mass. Benchmarks like the ImageNet challenge made that progress visible, and relentless. [2]
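The "stop early" advice in the training step is easy to state and easy to get subtly wrong, so here is a minimal sketch of patience-based early stopping. The function name, the `patience` and `min_delta` parameters, and the loss values are all illustrative assumptions; real training frameworks ship their own callbacks for this.

```python
def early_stopping_epoch(val_losses, patience=3, min_delta=0.0):
    """Return (stop_epoch, best_epoch) for a list of per-epoch validation losses.

    Training stops once the loss has failed to improve by more than
    `min_delta` for `patience` epochs in a row.
    """
    best_loss = float("inf")
    best_epoch = -1
    epochs_without_improvement = 0
    for epoch, loss in enumerate(val_losses):
        if loss < best_loss - min_delta:
            # New best: remember it and reset the patience counter.
            best_loss = loss
            best_epoch = epoch
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                return epoch, best_epoch
    # Ran out of epochs without triggering the stop rule.
    return len(val_losses) - 1, best_epoch


# Made-up validation losses: improvement stalls after epoch 2.
losses = [0.90, 0.70, 0.60, 0.62, 0.61, 0.65, 0.70]
print(early_stopping_epoch(losses))  # prints (5, 2)
```

In practice you would restore the weights saved at `best_epoch` and not keep the ones from the epoch where training stopped.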
Core tasks you'll actually use (and when) 🧩
- Image classification: One label per image. Use it for quick filters, triage, or quality gates.
- Object detection: Boxes around things. Retail loss prevention, vehicle detection, wildlife counts.
- Instance segmentation: Pixel-accurate silhouettes per object. Manufacturing defects, surgical tools, agritech.
- Semantic segmentation: A class per pixel without separating instances. Urban road scenes, land cover.
- Keypoint detection and pose: Joints, landmarks, facial features. Sports analytics, ergonomics, AR.
- Tracking: Follow objects over time. Logistics, traffic, security.
- OCR and document AI: Text extraction and layout parsing. Invoices, receipts, forms.
- Depth and 3D: Reconstruction from multiple views or monocular cues. Robotics, AR, mapping.
- Visual captioning: Summarize scenes in natural language. Accessibility, search.
- Vision-language models: Multimodal reasoning, retrieval-augmented vision, grounded QA.
Tiny case vibe: in stores, a detector flags missing shelf facings; a tracker prevents double-counting while staff restock; a simple rule routes low-confidence frames to human review. It's a small orchestra that mostly stays in tune.
Comparison table: tools to ship faster 🧰
A little quirky on purpose. Yes, the spacing is odd, I know.
| Tool / Framework | Best for | License/Price | Why it works in practice |
|---|---|---|---|
| OpenCV | Preprocessing, classical CV, quick POCs | Free, open source | Huge toolbox, stable APIs, battle-tested; sometimes all you need. [4] |
| PyTorch | Research-friendly training | Free | Dynamic graphs, massive ecosystem, plenty of tutorials. |
| TensorFlow/Keras | Production at scale | Free | Mature serving options, good for mobile and edge too. |
| Ultralytics YOLO | Fast object detection | Free + paid add-ons | Simple training loop, competitive speed and accuracy, opinionated but comfy. |
| Detectron2 / MMDetection | Strong baselines, segmentation | Free | Reference-grade models with reproducible results. |
| OpenVINO / ONNX Runtime | Inference optimization | Free | Squeeze latency, deploy widely without rewriting. |
| Tesseract | OCR on a budget | Free | Works well if you clean the image first... and sometimes you really should. |
What drives quality in Computer Vision in AI 🔧
- Data coverage: Lighting changes, angles, backgrounds, edge cases. If it can happen, include it.
- Label quality: Inconsistent boxes or sloppy polygons sabotage mAP. A little QA goes a long way.
- Smart augmentations: Crop, rotate, jitter brightness, add synthetic noise. Be realistic, not random chaos.
- Model-selection fit: Use detection where detection is needed; don't force a classifier to guess locations.
- Metrics that match impact: If false negatives hurt more, optimize recall. If false positives hurt more, put precision first.
- Tight feedback loop: Log failures, relabel, retrain. Rinse, repeat. Slightly boring, but it works.
For detection and segmentation, the community standard is Average Precision averaged across IoU thresholds, aka COCO-style mAP. Knowing how IoU and AP@{0.5:0.95} are computed keeps leaderboard claims from dazzling you with decimals. [3]
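To make the IoU part concrete, here is a small sketch for axis-aligned boxes, plus the ten thresholds that the `{0.5:0.95}` notation refers to (0.50 to 0.95 in steps of 0.05). The box format `(x_min, y_min, x_max, y_max)` and the helper names are assumptions for illustration. A full AP computation also needs confidence-ranked predictions and a precision-recall curve per class, which this sketch does not attempt.

```python
def iou(box_a, box_b):
    """Intersection over Union for boxes given as (x_min, y_min, x_max, y_max)."""
    inter_w = max(0.0, min(box_a[2], box_b[2]) - max(box_a[0], box_b[0]))
    inter_h = max(0.0, min(box_a[3], box_b[3]) - max(box_a[1], box_b[1]))
    intersection = inter_w * inter_h
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - intersection
    return intersection / union if union > 0 else 0.0


# The thresholds behind AP@{0.5:0.95}: 0.50, 0.55, ..., 0.95.
COCO_IOU_THRESHOLDS = [round(0.5 + 0.05 * i, 2) for i in range(10)]


def matches_at_thresholds(pred_box, truth_box):
    """Map each IoU threshold to whether this prediction would count as a match."""
    value = iou(pred_box, truth_box)
    return {t: value >= t for t in COCO_IOU_THRESHOLDS}


# Hypothetical boxes: the prediction is shifted 2 px to the right of the truth.
prediction = (2, 0, 12, 10)
truth = (0, 0, 10, 10)
print(round(iou(prediction, truth), 3))                     # prints 0.667
print(sum(matches_at_thresholds(prediction, truth).values()))  # prints 4
```

The same box is a hit at the four loosest thresholds and a miss at the six strictest, which is why a model can look strong at AP@0.5 and much weaker on the averaged metric.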
Real-world use cases that aren't hypothetical 🌍
- Retail: Shelf analytics, loss prevention, queue monitoring, planogram compliance.
- Manufacturing: Surface defect detection, assembly verification, robot guidance.
- Healthcare: Radiology triage, instrument detection, cell segmentation.
- Mobility: ADAS, traffic cameras, parking occupancy, micromobility tracking.
- Agriculture: Crop counting, disease spotting, harvest readiness.
- Insurance and finance: Damage assessment, KYC checks, fraud flags.
- Construction and energy: Safety compliance, leak detection, corrosion monitoring.
- Content and accessibility: Automatic captions, moderation, visual search.
The pattern you'll notice: replace manual scanning with automated triage, then escalate to humans when confidence drops. Not glamorous, but it scales.
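That "escalate when confidence drops" pattern is often just a threshold and two queues. A minimal sketch follows, assuming predictions arrive as `(item_id, label, confidence)` tuples; the 0.8 threshold, the item IDs, and the labels are hypothetical, and the right cutoff depends on how costly each kind of error is for you.

```python
def route(predictions, threshold=0.8):
    """Split predictions into an automatic queue and a human-review queue."""
    auto_queue, review_queue = [], []
    for item_id, label, confidence in predictions:
        # Confident predictions are acted on directly; the rest go to a person.
        target = auto_queue if confidence >= threshold else review_queue
        target.append((item_id, label))
    return auto_queue, review_queue


# Hypothetical model outputs for three images.
predictions = [
    ("img_001", "ok", 0.97),
    ("img_002", "defect", 0.55),
    ("img_003", "ok", 0.81),
]
auto_queue, review_queue = route(predictions)
print(auto_queue)    # prints [('img_001', 'ok'), ('img_003', 'ok')]
print(review_queue)  # prints [('img_002', 'defect')]
```

Tracking the share of items that land in the review queue over time also gives you a cheap early signal that the input data has shifted.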
Data, labels, and the metrics that matter 📊
- Classification: Accuracy, F1 for imbalance.
- Detection: mAP across IoU thresholds; inspect per-class AP and size buckets. [3]
- Segmentation: mIoU, Dice; check instance-level errors too.
- Tracking: MOTA, IDF1; re-identification quality is the silent hero.
- OCR: Character Error Rate (CER) and Word Error Rate (WER); layout failures often dominate.
- Regression tasks: Depth or pose use absolute/relative errors (often on log scales).
Document your evaluation protocol so others can replicate it. It's not exciting, but it keeps you honest.
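As one worked example from the list above, CER and WER are both an edit distance divided by the length of the reference, counted in characters or in words. The sketch below uses the standard Levenshtein distance; the function names and the sample strings are illustrative assumptions, and real OCR evaluations usually add text normalization (case, whitespace, punctuation) that is left out here.

```python
def edit_distance(reference, hypothesis):
    """Levenshtein distance between two sequences (strings or lists of words)."""
    previous = list(range(len(hypothesis) + 1))
    for i, ref_item in enumerate(reference, start=1):
        current = [i]
        for j, hyp_item in enumerate(hypothesis, start=1):
            substitution_cost = 0 if ref_item == hyp_item else 1
            current.append(min(
                previous[j] + 1,                       # deletion
                current[j - 1] + 1,                    # insertion
                previous[j - 1] + substitution_cost,   # match or substitution
            ))
        previous = current
    return previous[-1]


def cer(reference, hypothesis):
    """Character Error Rate: character edits divided by reference length."""
    if not reference:
        raise ValueError("reference text must not be empty")
    return edit_distance(reference, hypothesis) / len(reference)


def wer(reference, hypothesis):
    """Word Error Rate: word edits divided by number of reference words."""
    ref_words, hyp_words = reference.split(), hypothesis.split()
    if not ref_words:
        raise ValueError("reference text must not be empty")
    return edit_distance(ref_words, hyp_words) / len(ref_words)


# Hypothetical OCR output that confuses the letter O with the digit 0, twice.
truth_text = "INVOICE 1024"
ocr_text = "INV0ICE 1O24"
print(round(cer(truth_text, ocr_text), 3))  # prints 0.167
print(wer(truth_text, ocr_text))            # prints 1.0
```

Two wrong characters out of twelve look minor as CER, yet every word is wrong, which is why reporting both numbers tells a fuller story.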
Build vs buy, and where to run it 🏗️
- Cloud: Easiest to start, great for batch workloads. Watch egress costs.
- Edge devices: Lower latency and better privacy. You'll care about quantization, pruning, and accelerators.
- On-device mobile: Amazing when it fits. Optimize models and watch the battery.
- Hybrid: Pre-filter on the edge, heavy lifting in the cloud. A nice compromise.
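On the quantization point in the edge bullet: the core idea is storing weights as small integers plus one scale factor. The sketch below shows symmetric per-tensor int8 quantization on a plain list of floats. It is a toy under stated assumptions, with made-up weights and helper names; production toolchains add calibration data, per-channel scales, and quantized activations.

```python
def quantize_int8(values):
    """Symmetric per-tensor quantization of floats to integers in [-127, 127]."""
    max_abs = max(abs(v) for v in values)
    # One scale for the whole tensor; fall back to 1.0 for an all-zero input.
    scale = max_abs / 127 if max_abs > 0 else 1.0
    quantized = [max(-127, min(127, round(v / scale))) for v in values]
    return quantized, scale


def dequantize(quantized, scale):
    """Map the integers back to approximate float values."""
    return [q * scale for q in quantized]


# Hypothetical weights from one small layer.
weights = [0.5, -1.27, 0.003, 1.0]
quantized, scale = quantize_int8(weights)
restored = dequantize(quantized, scale)
print(quantized)  # prints [50, -127, 0, 100]
```

Note how the tiny weight 0.003 collapses to zero: the rounding error per value is at most half the scale, so a single large outlier can cost precision for everything else in the tensor.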
A boringly reliable stack: prototype with PyTorch, train a standard detector, export to ONNX, accelerate with OpenVINO/ONNX Runtime, and use OpenCV for preprocessing and geometry (calibration, homography, morphology). [4]
Risks, ethics, and the hard parts to talk about ⚖️
Vision systems can inherit dataset biases or operational blind spots. Independent evaluations (e.g., NIST FRVT) have measured demographic differentials in face recognition error rates across algorithms and conditions. That's not a reason to panic, but it is a reason to test carefully, document limitations, and monitor continuously in production. If you deploy identity- or safety-related use cases, include human review and appeal mechanisms. Privacy, consent, and transparency are not optional extras. [5]
A quick-start roadmap you can actually follow 🗺️
- Define the decision: What action should the system take after seeing an image? This keeps you from optimizing vanity metrics.
- Collect a scrappy dataset: Start with a few hundred images that reflect your real environment. Label carefully, even if it's just you and three sticky notes.
- Pick a baseline model: Choose a simple backbone with pretrained weights. Don't chase exotic architectures yet. [1]
- Train, log, evaluate: Track metrics, confusion points, and failure modes. Keep a notebook of "weird cases": snow, glare, reflections, odd fonts.
- Tighten the loop: Add hard negatives, fix label drift, adjust augmentations, and retune thresholds. Small tweaks add up. [3]
- Deploy a slim version: Quantize and export. Measure latency and throughput in the real environment, not on a toy benchmark.
- Monitor and iterate: Collect misfires, relabel, retrain. Schedule periodic evaluations so your model doesn't drift unnoticed.
Pro tip: have your most skeptical teammate annotate a small holdout set. If they can't poke holes in it, you're probably ready.
Common gotchas you'll want to avoid 🧨
- Training on clean studio images, then deploying to the real world with rain on the lens.
- Optimizing for overall mAP when you really care about one critical class. [3]
- Ignoring class imbalance and then wondering why rare events vanish.
- Over-augmenting until the model learns artificial artifacts.
- Skipping camera calibration and then fighting perspective errors forever. [4]
- Believing leaderboard numbers without replicating the exact evaluation setup. [2][3]
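For the class-imbalance gotcha, one common mitigation (not the only one) is to weight each class inversely to its frequency in the loss. The sketch below uses the widely used "balanced" heuristic, weight = total / (number of classes × class count). The function name and the 90/10 label split are illustrative assumptions; resampling or collecting more rare examples may serve you better.

```python
from collections import Counter


def balanced_class_weights(labels):
    """Inverse-frequency weights: total / (num_classes * count_for_class)."""
    counts = Counter(labels)
    total = len(labels)
    num_classes = len(counts)
    return {cls: total / (num_classes * n) for cls, n in counts.items()}


# Hypothetical inspection dataset: defects are rare.
labels = ["ok"] * 90 + ["defect"] * 10
class_weights = balanced_class_weights(labels)
print(class_weights["defect"])  # prints 5.0
```

With these weights, one missed defect costs the model as much as nine missed "ok" items, so the rare class no longer vanishes from the loss.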
Sources worth bookmarking 🔗
If you like foundational material and course notes, these are gold for fundamentals, practice, and benchmarks. See the References section: the CS231n notes, the ImageNet challenge paper, the COCO dataset and evaluation docs, the OpenCV docs, and the NIST FRVT reports. [1][2][3][4][5]
Final remarks, or Too Long, Didn't Read 🍃
Computer Vision in AI turns pixels into decisions. It shines when you pair the right task with the right data, measure the right things, and iterate with unusual discipline. The tooling is generous, the benchmarks are public, and the path from prototype to production is surprisingly short if you focus on the end decision. Get your labels straight, choose metrics that match impact, and let the models do the heavy lifting. And if a metaphor helps, think of it as teaching a very fast but very literal intern to spot what matters. You show examples, correct mistakes, and gradually trust it with real work. Not perfect, but close enough to be transformative. 🌟
References
[1] CS231n: Deep Learning for Computer Vision (course notes) - Stanford University.
[2] ImageNet Large Scale Visual Recognition Challenge (paper) - Russakovsky et al.
[3] COCO Dataset & Evaluation - Official site (task definitions and mAP/IoU conventions).
[4] OpenCV Documentation (v4.x) - Modules for preprocessing, calibration, morphology, etc.
[5] NIST FRVT Part 3: Demographic Effects (NISTIR 8280) - Independent evaluation of face recognition accuracy across demographic groups.