Hash addressed content #3

Closed
opened 2018-06-09 17:56:05 +00:00 by jcgruenhage · 2 comments
Owner

To easily get git objects out of the media repository, we need either a mapping from SHA-1 to mxc:// URL or a way to fetch an object using only its hash.

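Instead of [`GET /_matrix/media/r0/download/{serverName}/{mediaId}`](https://matrix.org/docs/spec/client_server/r0.3.0.html#get-matrix-media-r0-download-servername-mediaid), we could do something like `GET /_matrix/media/r0/download/{multihash}/{roomId}/{primaryServer}`, where the last part would be optional.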

The multihash would follow the format defined at https://github.com/multiformats/multihash
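As a rough illustration of the proposal above, the following sketch computes the SHA-1 that git assigns to an object, wraps it as a multihash, and builds the proposed download path. The helper names, the hex text encoding of the multihash, and the percent-encoding of the room ID are assumptions made for this example; the issue does not specify them. Treating `{primaryServer}` as the optional part is one reading of "the last part".

```python
import hashlib
from typing import Optional
from urllib.parse import quote

SHA1_CODE = 0x11    # multihash function code for SHA-1
SHA1_LENGTH = 0x14  # SHA-1 digest length in bytes (20)


def git_object_sha1(obj_type: str, payload: bytes) -> bytes:
    """Hash an object the way git does: "<type> <size>\\0" followed by the payload."""
    header = f"{obj_type} {len(payload)}".encode("ascii") + b"\x00"
    return hashlib.sha1(header + payload).digest()


def sha1_multihash_hex(digest: bytes) -> str:
    """Prefix the digest with the multihash code and length bytes.

    Hex is an assumed text encoding; the proposal does not pick one.
    """
    if len(digest) != SHA1_LENGTH:
        raise ValueError("expected a 20-byte SHA-1 digest")
    return (bytes([SHA1_CODE, SHA1_LENGTH]) + digest).hex()


def download_path(multihash: str, room_id: str,
                  primary_server: Optional[str] = None) -> str:
    """Build the hypothetical hash-addressed download path from the proposal."""
    parts = ["/_matrix/media/r0/download", multihash, quote(room_id, safe="")]
    if primary_server is not None:
        parts.append(quote(primary_server, safe=""))
    return "/".join(parts)


if __name__ == "__main__":
    mh = sha1_multihash_hex(git_object_sha1("blob", b""))
    # "!room:example.org" is a placeholder room ID.
    print(download_path(mh, "!room:example.org", "example.org"))
```

The two-byte `11 14` prefix is what lets the same endpoint accept other hash functions later without changing the URL shape.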

jcgruenhage added the Bikeshedding label 2018-06-09 17:56:05 +00:00
Author
Owner

An alternative (that I like a lot less, but that might make Maximus happier) would be to not store the git objects as is, but instead to put an mxc:// URL right after each SHA-1. That way, we need neither hash-addressed content nor a place to store the mapping, at least on the server side. The disadvantage is that if we get the objects even slightly wrong when reconstructing them, the hashes will not match and git will be very unhappy.

This also means that we could encrypt the blobs, because we would no longer rely on the stored content matching the hashes. It would also mean that people can no longer use a commit hash (sometimes included in version codes) to pull the complete source.
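The reconstruction risk mentioned above could be caught early by re-hashing each rebuilt object before handing it to git. This is only a minimal sketch of such a check; the function names are hypothetical, and how the payload is fetched from the mxc:// URL (and decrypted, if encrypted) is left out entirely.

```python
import hashlib


def git_object_sha1_hex(obj_type: str, payload: bytes) -> str:
    """Return the hex SHA-1 git would assign to this object."""
    header = f"{obj_type} {len(payload)}".encode("ascii") + b"\x00"
    return hashlib.sha1(header + payload).hexdigest()


def verify_reconstructed(expected_sha1: str, obj_type: str, payload: bytes) -> bool:
    """Check that a reconstructed object still hashes to the SHA-1 it was stored under."""
    return git_object_sha1_hex(obj_type, payload) == expected_sha1.lower()


if __name__ == "__main__":
    expected = "ce013625030ba8dba906f756967f9e9ca394464a"  # blob containing "hello\n"
    print(verify_reconstructed(expected, "blob", b"hello\n"))  # intact payload
    print(verify_reconstructed(expected, "blob", b"hello"))    # trailing newline lost
```

A single missing byte changes the hash, which is why a mismatch here would surface as a corrupt object on the git side.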

Author
Owner

Not coming any time soon, because hash-addressed content might block future streaming use cases, where the file does not fully exist yet (and so cannot be hashed) when the download begins.

Reference: jcgruenhage/git-on-matrix#3