Comparing image files for recognition, deduplication and similar tasks often starts with generating a hash of each image. The obvious approach is to write the image to disk and read it back to calculate the hash. In Go, however, you can encode a JPEG and calculate a consistent hash of the encoded bytes in the same pass, with no extra read from disk. This article walks through a question on the topic and the answer to it.
Golang imaging newbie here.
I'm trying to generate consistent hashes for JPEG images. If I hash the raw pixel bytes, write the image to disk as a JPEG, then load it again and hash the raw pixel bytes of the reloaded image, I get a different hash. Encoding the RGBA image as a JPEG modifies the pixels (which is expected), so the hash I calculated earlier no longer matches.
Just hashing the file, hash("abc.jpeg"), means I have to write to disk, read it back, generate the hash, and so on.
    // Open the input image file
    inputFile, _ := os.Open("a.jpg")
    defer inputFile.Close()

    // Decode the input image
    inputImage, _, _ := image.Decode(inputFile)

    // Get the dimensions of the input image
    width := inputImage.Bounds().Dx()
    height := inputImage.Bounds().Dy()
    subWidth := width / 4
    subHeight := height / 4

    // Create a new image
    subImg := image.NewRGBA(image.Rect(0, 0, subWidth, subHeight))
    draw.Draw(subImg, subImg.Bounds(), inputImage, image.Point{0, 0}, draw.Src)

    // I'd want the hashes to be the same for read / write but they will always differ
    hash1 := sha256.Sum256(imageToBytes(subImg))
    fmt.Printf("<---OUT [%s] %x\n", filename, hash1)
    jpg, _ := os.Create("mytest.jpg")
    _ = jpeg.Encode(jpg, subImg, nil)
    jpg.Close()

    // upon reading it back in, the pixels are ever so slightly different
    f, _ := os.Open("mytest.jpg")
    img, _, _ := image.Decode(f)
    jpg_input := image.NewRGBA(img.Bounds())
    draw.Draw(jpg_input, img.Bounds(), img, image.Point{0, 0}, draw.Src)
    hash2 := sha256.Sum256(imageToBytes(jpg_input))
    fmt.Printf("--->IN [%s] %x\n", filename, hash2)

    // real world use case is..
    // generate subtile of large image plus hash
    // if hash in a dbase
    //     pixel walk to see if hash collision occurred
    //     if pixels are different
    //         deal with it...
    //     else
    //         object.filename = dbaseb.filename
    // else
    //     add filename to dbase with hash as the lookup
    //     write to jpeg to disk
You can use a hash as one of the writer's targets and use io.MultiWriter to calculate the hash while writing to the file:
    hash := sha256.New()
    jpeg.Encode(io.MultiWriter(file, hash), img, nil)
    hashValue := hash.Sum(nil)