This is the initial implementation for the new URL unfurling requirements. The
most important one is that only the message sender will pay the privacy cost for
unfurling and extracting metadata from websites. Once the message is sent, the
unfurled data will be stored at the protocol level and receivers will just
profit and happily decode the metadata to render it.
Further development of this URL unfurling capability will be mostly guided by
issues created on clients. For the moment in status-mobile:
https://github.com/status-im/status-mobile/labels/url-preview
- https://github.com/status-im/status-mobile/issues/15918
- https://github.com/status-im/status-mobile/issues/15917
- https://github.com/status-im/status-mobile/issues/15910
- https://github.com/status-im/status-mobile/issues/15909
- https://github.com/status-im/status-mobile/issues/15908
- https://github.com/status-im/status-mobile/issues/15906
- https://github.com/status-im/status-mobile/issues/15905
### Terminology
In the code, I've tried to stick to the word "unfurl URL" to really mean the
process of extracting metadata from a website, sort of lower level. I use "link
preview" to mean a higher level structure which is enriched by unfurled data.
"link preview" is also how designers refer to it.
### User flows
1. Carol needs to see link previews while typing in the chat input field. Notice
from the diagram nothing is persisted and that status-go endpoints are
essentially stateless.
```
#+begin_src plantuml :results verbatim
Client->>Server: Call wakuext_getTextURLs
Server-->>Client: Normalized URLs
Client->>Client: Render cached unfurled URLs
Client->>Server: Unfurl non-cached URLs.\nCall wakuext_unfurlURLs
Server->>Website: Fetch metadata
Website-->>Server: Metadata (thumbnail URL, title, etc)
Server->>Website: Fetch thumbnail
Server->>Website: Fetch favicon
Website-->>Server: Favicon bytes
Website-->>Server: Thumbnail bytes
Server->>Server: Decode & process images
Server-->>Client: Unfurled data (thumbnail data URI, etc)
#+end_src
```
```
,------. ,------. ,-------.
|Client| |Server| |Website|
`--+---' `--+---' `---+---'
| Call wakuext_getTextURLs | |
| ---------------------------------------> |
| | |
| Normalized URLs | |
| <- - - - - - - - - - - - - - - - - - - - |
| | |
|----. | |
| | Render cached unfurled URLs | |
|<---' | |
| | |
| Unfurl non-cached URLs. | |
| Call wakuext_unfurlURLs | |
| ---------------------------------------> |
| | |
| | Fetch metadata |
| | ------------------------------------>
| | |
| | Metadata (thumbnail URL, title, etc)|
| | <- - - - - - - - - - - - - - - - - -
| | |
| | Fetch thumbnail |
| | ------------------------------------>
| | |
| | Fetch favicon |
| | ------------------------------------>
| | |
| | Favicon bytes |
| | <- - - - - - - - - - - - - - - - - -
| | |
| | Thumbnail bytes |
| | <- - - - - - - - - - - - - - - - - -
| | |
| |----. |
| | | Decode & process images |
| |<---' |
| | |
| Unfurled data (thumbnail data URI, etc)| |
| <- - - - - - - - - - - - - - - - - - - - |
,--+---. ,--+---. ,---+---.
|Client| |Server| |Website|
`------' `------' `-------'
```
2. Carol sends the text message with link previews in the RPC request
wakuext_sendChatMessages. status-go assumes the link previews are good
because it can't and shouldn't attempt to re-unfurl them.
```
#+begin_src plantuml :results verbatim
Client->>Server: Call wakuext_sendChatMessages
Server->>Server: Transform link previews to\nbe proto-marshalled
Server->DB: Write link previews serialized as JSON
Server-->>Client: Updated message response
#+end_src
```
```
,------. ,------. ,--.
|Client| |Server| |DB|
`--+---' `--+---' `+-'
| Call wakuext_sendChatMessages| |
| -----------------------------> |
| | |
| |----. |
| | | Transform link previews to |
| |<---' be proto-marshalled |
| | |
| | |
| | Write link previews serialized as JSON|
| | -------------------------------------->
| | |
| Updated message response | |
| <- - - - - - - - - - - - - - - |
,--+---. ,--+---. ,+-.
|Client| |Server| |DB|
`------' `------' `--'
```
3. The message was sent over waku and persisted locally in Carol's device. She
should now see the link previews in the chat history. There can be many link
previews shared by other chat members, therefore it is important to serve the
assets via the media server to avoid overloading the ReactNative bridge with
lots of big JSON payloads containing base64 encoded data URIs (maybe this
concern is meaningless for desktop). When a client is rendering messages with
link previews, they will have the field linkPreviews, and the thumbnail URL
will point to the local media server.
```
#+begin_src plantuml :results verbatim
Client->>Server: GET /link-preview/thumbnail (media server)
Server->>DB: Read from user_messages.unfurled_links
Server->Server: Unmarshal JSON
Server-->>Client: HTTP Content-Type: image/jpeg/etc
#+end_src
```
```
,------. ,------. ,--.
|Client| |Server| |DB|
`--+---' `--+---' `+-'
| GET /link-preview/thumbnail (media server)| |
| ------------------------------------------> |
| | |
| | Read from user_messages.unfurled_links|
| | -------------------------------------->
| | |
| |----. |
| | | Unmarshal JSON |
| |<---' |
| | |
| HTTP Content-Type: image/jpeg/etc | |
| <- - - - - - - - - - - - - - - - - - - - - |
,--+---. ,--+---. ,+-.
|Client| |Server| |DB|
`------' `------' `--'
```
### Some limitations of the current implementation
The following points will become separate issues in status-go that I'll work on
over the next couple weeks. In no order of importance:
- Improve how multiple links are fetched; retries on failure and testing how
unfurling behaves around the timeout limits (deterministically, not by making
real HTTP calls as I did). https://github.com/status-im/status-go/issues/3498
- Unfurl favicons and store them in the protobuf too.
- For this PR, I added unfurling support only for websites with OpenGraph
https://ogp.me/ meta tags. Other unfurlers will be implemented on demand. The
next one will probably be for oEmbed https://oembed.com/, the protocol
supported by YouTube, for example.
- Resize and/or compress thumbnails (and favicons). Often times, thumbnails are
huge for the purposes of link previews. There is already support for
compressing JPEGs in status-go, but I prefer to work with compression in a
separate PR because I'd like to also solve the problem for PNGs (probably
convert them to JPEGs, plus compress them). This would be a safe choice for
thumbnails, favicons not so much because transparency is desirable.
- Editing messages is not yet supported.
- I haven't coded any artificial limit on the number of previews or on the size
of the thumbnail payload. This will be done in a separate issue. I have heard
the ideal solution may be to split messages into smaller chunks of ~125 KiB
because of libp2p, but that might be too complicated at this stage of the
product (?).
- Link preview deletion.
- For the moment, OpenGraph metadata is extracted by requesting data for the
English language (and fallback to whatever is available). In the future, we'll
want to unfurl by respecting the user's local device language. Some websites,
like GoDaddy, are already localized based on the device's IP, but many aren't.
- The website's description text should be limited by a certain number of
characters, especially because it's outside our control. Exactly how much has
not been decided yet, so it'll be done separately.
- URL normalization can be tricky, so I implemented only the basics to help with
caching. For example, the url https://status.im and HTTPS://status.im are
considered identical. Also, a URL is considered valid for unfurling if its TLD
exists according to publicsuffix.EffectiveTLDPlusOne. This was essential,
otherwise the default Go url.Parse approach would consider many invalid URLs
valid, and thus the server would waste resources trying to unfurl the
unfurleable.
### Other requirements
- If the message is edited, the link previews should reflect the edited text,
not the original one. This has been aligned with the design team as well.
- If the website's thumbnail or the favicon can't be fetched, just ignore them.
The only mandatory piece of metadata is the website's title and URL.
- Link previews in clients should be generated in near real-time, that is, as
the user types, previews are updated. In mobile this performs very well, and
it's what other clients like WhatsApp, Telegram, and Facebook do.
### Decisions
- While the user typing in the input field, the client is constantly (debounced)
asking status-go to parse the text and extract normalized URLs and then the
client checks if they're already in its in-memory cache. If they are, no RPC
call is made. I chose this approach to achieve the best possible performance
in mobile and avoid the whole RPC overhead, since the chat experience is
already not smooth enough. The mobile client uses URLs as cache keys in a
hashmap, i.e. if the key is present, it means the preview is readily available
(naive, but good enough for now). This decision also gave me more flexibility
to find the best UX at this stage of the feature.
- Due to the requirement that users should be able to see independent loading
indicators for each link preview, when status-go can't unfurl a URL, it
doesn't return it in the response.
- As an initial implementation, I added the BLOB column unfurled_links to the
user_messages table. The preview data is then serialized as JSON before being
stored in this column. I felt that creating a separate table and the related
code for this initial PR would be inconvenient. Is that reasonable to you?
Once things stabilize I can create a proper table if we want to avoid this
kind of solution with serialized columns.
* fix(mentions): deleting or editing a mention should remove the mention
* test(edit): add a test for mentions in edits
* test(delete): add test for deleting a message with a mention
This commit replaces `os.MkdirTemp` with `t.TempDir` in tests. The
directory created by `t.TempDir` is automatically removed when the test
and all its subtests complete.
Prior to this commit, temporary directory created using `os.MkdirTemp`
needs to be removed manually by calling `os.RemoveAll`, which is omitted
in some tests. The error handling boilerplate e.g.
defer func() {
if err := os.RemoveAll(dir); err != nil {
t.Fatal(err)
}
}
is also tedious, but `t.TempDir` handles this for us nicely.
Reference: https://pkg.go.dev/testing#T.TempDir
Signed-off-by: Eng Zer Jun <engzerjun@gmail.com>
There were a couple of issues on how we handle pinned messages:
1) Clock of the message was only checked when saving, meaning that the
client would receive potentially updates that were not to be
processed.
2) We relied on the client to generate a notification for a pinned
message by sending a normal message through the wire. This PR changes
the behavior so that the notification is generated locally, either on
response to a network event or client event.
3) When deleting a message, we pull all the replies/pinned notifications
and send them over to the client so they know that those messages
needs updating.
A migration was added out-of-order, which meant that in clients who
had already run the migration after, it would be skip.
This commit re-adds the migration so it's run, tested against an empty
account and one that had already migrated.
Because `QuotedMessage` doesn't include imported message data,
some of the author information in imported messages is lost in
frontends.
This commit adds a `discordMessage` (soon replaced by `importedMessage`)
to `Quotedmessage`, although only hydrated with a subset of data,
namely author display name and avatar URL, as those are the only ones
needed by front-end atm.
This adds a new `DiscordMessageAttachment` type which is part of
`DiscordMessage`. Along with that type, there's also a new database
table for `discord_message_attachments` and corresponding persistence
APIs.
This commit also changes how chat messages are retrieved.
Here's why:
`DiscordMessage` can have multiple `DiscordMessageAttachment`.
A chat message can have a `DiscordMessage`.
Because we're `LEFT JOIN`'ing the discord message attachments into the
chat messages, there's a possibility of multiple rows per message.
Hence, this commit ensures we collect queried discord message
attachments on chat messages.
This adds a new `discord_messages` table and extends the persistence
APIs such that `MessagesByID` and `MessageByID` will return user
messages that include their discord message payload.
It also adds APIs to save individual discord messages.
* Added function to get preffered network IP
Also done some refactor work oon server package to make a lot more reusable
* Added server.Option and simplified handler funcs
* Added serial number deterministically generated from pk
* Debugging TLS server connection
* Implemented configurable server ip
When accessing over the network the server needs to listen on the network port and not localhost or 127.0.0.1 . Also the cert can now have a dedicated IP
* Refactor of URL funcs to use the url package
* Removed redundant Options pattern in favour of config param
* Added full server test using GetOutboundIP
* Remove references and usage of Server.port
The application does not need to set the port, we rely on the net.Listener to pick a port.
* Version bump
* Added ToECDSA func and improved cert testing
* Added error check in test
* Split Server types, embedding raw Server funcs into specialised server types
* localhost
* Implemented DNS and IP based cert gen
ios doesn't allow for restricted ip addresses to be used in a valid tls cert
* Replace listener handling with original port store
Also added handlers as a parameter of the Server
Sometimes the message scheduled & message sent notifications are
received out of order.
Now we use a single channel for both so order is maintained.
This was causing the tests to be flaky and it might have happened in
production as well.
* Adding wakunode module
* Adding wakuv2 fleet files
* Add waku fleets to update-fleet-config script
* Adding config items for waku v2
* Conditionally start waku v2 node depending on config
* Adapting common code to use go-waku
* Setting log level to info
* update dependencies
* update fleet config to use WakuNodes instead of BootNodes
* send and receive messages
* use hash returned when publishing a message
* add waku store protocol
* trigger signal after receiving store messages
* exclude linting rule SA1019 to check deprecated packages
* add PinMessage and PinnedMessage
* fix gruop pin messages
* add SkipGroupMessageWrap to pin messages
* update pinMessage ID generation to be symmetric
This commit introduces the following changes:
- `local-notifications` require as body an interface complying with
`json.Marshaler`
- removed unmarshaling of `Notifications` as not used (we only Marshal
notifications)
- `protocol/messenger.go` creates directly a `Notification` instead of
having an intermediate format
- add community notifications on request to join
- move parsing of text in status-go for notifications
This commit expands the confirmation mechanism to allow private group
chat messages to be confirmed:
Changes:
- Added a separate table for message confirmations as group chat
messages have same messageID but multiple datasyncID
- Removed DataSyncID from raw message (I haven't removed the column name
as it can't be done in sqlite without copying over the table)
* Revert "Revert "Expand Local Notifications to support multiple Notification types (#2100)""
This reverts commit 5887337b88.
* Revert "Revert "fix protocol.MessageNotificationBody marshalling""
This reverts commit cf0a16dff1.
* Bump version to 0.70.0
* Added localnotifications for Transaction messages
* Fixed bug where Message.SigPubKey was presumed to be set
* Added lookup for contact existing in Messenger.allContacts
Additionally added functionality to add a contact to the messenger store if it isn't present
* Get chat directly from Messenger.allChats store
Co-authored-by: Andrea Maria Piana <andrea.maria.piana@gmail.com>
There was a bug on status-react where it would save filters that were
not listened to.
This commit adds a task to clean up those filters as they might result
in long syncing times.
This commit also returns topics/ranges/mailserves from messenger in
order to make the initialization of the app simpler and start moving
logic to status-go.
It also removes whisper from vendor.
One of the issues we noticed is that the partitioned topic
in push notification is heavy in traffic, as any user using a particular
mailserver will use that partitioned topic to register for PNs.
This commit moves from the partitioned topic to the personal topic of
the PN server, so it does not clash with other users that might happen
to have the same partitioned topic as the mailserver, resulting in long
sync times.
Another issue that will need to be addressed separately is that once you
send a message to a topic, because of the way how waku/whisper works,
you will have to register to that topic, meaning that you will receive
that data. Currently waku does not support unsubscribing from a topic
without logging in and out, so that needs also to be addressed.
When sending a message in a private group chat we send the whole history
for redundancy and allow out-of-order processing.
This can be very expensive in some chats, resulting in long delay when
sending a message and calculating the POW.
This commit improves the performance by only forwarding the events
necessary for the user to be able to construct a group chat and process
correctly the message.
This commit fixes a couple of issues:
1) Emojis were sent to any member of the group chat, regardless of
whether they joined
2) We don't want to wrap emojis, as there's no need to do so, only
messages are to be wrapped
This commit fixes a bug on the mvds library where the nextEpoch would be
incorrectly summed to the retry time, resulting in messages not being
retried, or retried much less frequently the longer the app was running.
It also updates the retry timing to backoff exponentially at multiple of
30 seconds.
When sending messages in quick succession, it might be that multiple
messages are batched together in datasync, resulting in a single large
payload.
This commit changes the behavior so that we can pass a max-message-size
and we split the message in batches before sending.
A more elegant way would be to split at the transport layer (i.e
waku/whisper), but that would be incompatible with older client.
We can still do that eventually to support larger messages.
We were checking for the wrong error kind when pulling messages from the
database, which resulted in the code not retrying to pull the message,
giving flaky tests / race condition (that's present in production as
well)