Move to protobuf for Message type (#1706)
* Use a single Message type `v1/message.go` and `message.go` are the same now, and they embed `protobuf.ChatMessage`
* Use `SendChatMessage` for sending chat messages, this is basically the old `Send` but a bit more flexible so we can send different message types (stickers,commands), and not just text.
* Remove dedup from services/shhext. Because now we process in status-protocol, dedup makes less sense, as those messages are going to be processed anyway, so removing for now, we can re-evaluate if bringing it to status-go or not.
* Change the various retrieveX method to a single one:
`RetrieveAll` will be processing those messages that it can process (Currently only `Message`), and return the rest in `RawMessages` (still transit). The format for the response is:
`Chats`: -> The chats updated by receiving the message
`Messages`: -> The messages retrieved (already matched to a chat)
`Contacts`: -> The contacts updated by the messages
`RawMessages` -> Anything else that can't be parsed, eventually as we move everything to status-protocol-go this will go away.
2019-12-05 17:25:34 +01:00
|
|
|
syntax = "proto3";
|
|
|
|
|
2021-08-06 16:40:23 +01:00
|
|
|
option go_package = "./;protobuf";
|
Move to protobuf for Message type (#1706)
* Use a single Message type `v1/message.go` and `message.go` are the same now, and they embed `protobuf.ChatMessage`
* Use `SendChatMessage` for sending chat messages, this is basically the old `Send` but a bit more flexible so we can send different message types (stickers,commands), and not just text.
* Remove dedup from services/shhext. Because now we process in status-protocol, dedup makes less sense, as those messages are going to be processed anyway, so removing for now, we can re-evaluate if bringing it to status-go or not.
* Change the various retrieveX method to a single one:
`RetrieveAll` will be processing those messages that it can process (Currently only `Message`), and return the rest in `RawMessages` (still transit). The format for the response is:
`Chats`: -> The chats updated by receiving the message
`Messages`: -> The messages retrieved (already matched to a chat)
`Contacts`: -> The contacts updated by the messages
`RawMessages` -> Anything else that can't be parsed, eventually as we move everything to status-protocol-go this will go away.
2019-12-05 17:25:34 +01:00
|
|
|
package protobuf;
|
|
|
|
|
2020-07-25 12:13:23 +01:00
|
|
|
import "enums.proto";
|
2023-02-02 18:12:20 +00:00
|
|
|
import "contact.proto";
|
2020-07-25 12:13:23 +01:00
|
|
|
|
Move to protobuf for Message type (#1706)
* Use a single Message type `v1/message.go` and `message.go` are the same now, and they embed `protobuf.ChatMessage`
* Use `SendChatMessage` for sending chat messages, this is basically the old `Send` but a bit more flexible so we can send different message types (stickers,commands), and not just text.
* Remove dedup from services/shhext. Because now we process in status-protocol, dedup makes less sense, as those messages are going to be processed anyway, so removing for now, we can re-evaluate if bringing it to status-go or not.
* Change the various retrieveX method to a single one:
`RetrieveAll` will be processing those messages that it can process (Currently only `Message`), and return the rest in `RawMessages` (still transit). The format for the response is:
`Chats`: -> The chats updated by receiving the message
`Messages`: -> The messages retrieved (already matched to a chat)
`Contacts`: -> The contacts updated by the messages
`RawMessages` -> Anything else that can't be parsed, eventually as we move everything to status-protocol-go this will go away.
2019-12-05 17:25:34 +01:00
|
|
|
message StickerMessage {
|
|
|
|
string hash = 1;
|
|
|
|
int32 pack = 2;
|
|
|
|
}
|
|
|
|
|
2020-05-13 15:16:17 +02:00
|
|
|
message ImageMessage {
|
|
|
|
bytes payload = 1;
|
|
|
|
ImageType type = 2;
|
2023-02-02 12:00:49 +04:00
|
|
|
string album_id = 3;
|
|
|
|
uint32 width = 4;
|
|
|
|
uint32 height = 5;
|
2023-04-05 15:24:55 +02:00
|
|
|
uint32 album_images_count = 6;
|
2020-05-13 15:16:17 +02:00
|
|
|
}
|
|
|
|
|
2020-06-17 20:55:49 +02:00
|
|
|
message AudioMessage {
|
|
|
|
bytes payload = 1;
|
|
|
|
AudioType type = 2;
|
2020-06-23 16:30:39 +02:00
|
|
|
uint64 duration_ms = 3;
|
2020-06-17 20:55:49 +02:00
|
|
|
enum AudioType {
|
|
|
|
UNKNOWN_AUDIO_TYPE = 0;
|
|
|
|
AAC = 1;
|
|
|
|
AMR = 2;
|
|
|
|
}
|
|
|
|
}
|
|
|
|
|
2021-06-07 10:31:27 +02:00
|
|
|
message EditMessage {
|
|
|
|
uint64 clock = 1;
|
|
|
|
// Text of the message
|
|
|
|
string text = 2;
|
|
|
|
|
|
|
|
string chat_id = 3;
|
|
|
|
string message_id = 4;
|
|
|
|
|
|
|
|
// Grant for community edit messages
|
|
|
|
bytes grant = 5;
|
|
|
|
|
|
|
|
// The type of message (public/one-to-one/private-group-chat)
|
|
|
|
MessageType message_type = 6;
|
2022-10-05 21:54:47 +04:00
|
|
|
|
2022-10-07 10:05:02 +03:00
|
|
|
ChatMessage.ContentType content_type = 7;
|
2021-06-07 10:31:27 +02:00
|
|
|
}
|
|
|
|
|
2021-07-26 17:06:32 -04:00
|
|
|
message DeleteMessage {
|
|
|
|
uint64 clock = 1;
|
|
|
|
|
|
|
|
string chat_id = 2;
|
|
|
|
string message_id = 3;
|
|
|
|
|
|
|
|
// Grant for community delete messages
|
|
|
|
bytes grant = 4;
|
|
|
|
|
|
|
|
// The type of message (public/one-to-one/private-group-chat)
|
|
|
|
MessageType message_type = 5;
|
2023-02-01 08:57:35 +08:00
|
|
|
|
|
|
|
string deleted_by = 6;
|
2021-07-26 17:06:32 -04:00
|
|
|
}
|
|
|
|
|
2022-09-28 19:42:17 +08:00
|
|
|
message DeleteForMeMessage {
|
|
|
|
uint64 clock = 1;
|
|
|
|
string message_id = 2;
|
|
|
|
}
|
|
|
|
|
feat: introduce messenger APIs to extract discord channels
As part of the new Discord <-> Status Community Import functionality,
we're adding an API that extracts all discord categories and channels
from a previously exported discord export file.
These APIs can be used in clients to show the user what categories and
channels will be imported later on.
There are two APIs:
1. `Messenger.ExtractDiscordCategoriesAndChannels(filesToimport
[]string) (*MessengerResponse, map[string]*discord.ImportError)`
This takes a list of exported discord export (JSON) files (typically one per
channel), reads them, and extracts the categories and channels into
dedicated data structures (`[]DiscordChannel` and `[]DiscordCategory`)
It also returns the oldest message timestamp found in all extracted
channels.
The API is synchronous and returns the extracted data as
a `*MessengerResponse`. This allows to make the API available
status-go's RPC interface.
The error case is a `map[string]*discord.ImportError` where each key
is a file path of a JSON file that we tried to extract data from, and
the value a `discord.ImportError` which holds an error message and an
error code, allowing for distinguishing between "critical" errors and
"non-critical" errors.
2. `Messenger.RequestExtractDiscordCategoriesAndChannels(filesToImport
[]string)`
This is the asynchronous counterpart to
`ExtractDiscordCategoriesAndChannels`. The reason this API has been
added is because discord servers can have a lot of message and
channel data, which causes `ExtractDiscordCategoriesAndChannels` to
block the thread for too long, making apps potentially feel like they
are stuck.
This API runs inside a go routine, eventually calls
`ExtractDiscordCategoriesAndChannels`, and then emits a newly
introduced `DiscordCategoriesAndChannelsExtractedSignal` that clients
can react to.
Failure of extraction has to be determined by the
`discord.ImportErrors` emitted by the signal.
**A note about exported discord history files**
We expect users to export their discord histories via the
[DiscordChatExporter](https://github.com/Tyrrrz/DiscordChatExporter/wiki/GUI%2C-CLI-and-Formats-explained#exportguild)
tool. The tool allows to export the data in different formats, such as
JSON, HTML and CSV.
We expect users to have their data exported as JSON.
Closes: https://github.com/status-im/status-desktop/issues/6690
2022-07-13 11:33:53 +02:00
|
|
|
message DiscordMessage {
|
|
|
|
string id = 1;
|
|
|
|
string type = 2;
|
|
|
|
string timestamp = 3;
|
|
|
|
string timestampEdited = 4;
|
|
|
|
string content = 5;
|
|
|
|
DiscordMessageAuthor author = 6;
|
2022-08-04 18:16:56 +02:00
|
|
|
DiscordMessageReference reference = 7;
|
2022-09-20 11:13:44 +02:00
|
|
|
repeated DiscordMessageAttachment attachments = 8;
|
feat: introduce messenger APIs to extract discord channels
As part of the new Discord <-> Status Community Import functionality,
we're adding an API that extracts all discord categories and channels
from a previously exported discord export file.
These APIs can be used in clients to show the user what categories and
channels will be imported later on.
There are two APIs:
1. `Messenger.ExtractDiscordCategoriesAndChannels(filesToimport
[]string) (*MessengerResponse, map[string]*discord.ImportError)`
This takes a list of exported discord export (JSON) files (typically one per
channel), reads them, and extracts the categories and channels into
dedicated data structures (`[]DiscordChannel` and `[]DiscordCategory`)
It also returns the oldest message timestamp found in all extracted
channels.
The API is synchronous and returns the extracted data as
a `*MessengerResponse`. This allows to make the API available
status-go's RPC interface.
The error case is a `map[string]*discord.ImportError` where each key
is a file path of a JSON file that we tried to extract data from, and
the value a `discord.ImportError` which holds an error message and an
error code, allowing for distinguishing between "critical" errors and
"non-critical" errors.
2. `Messenger.RequestExtractDiscordCategoriesAndChannels(filesToImport
[]string)`
This is the asynchronous counterpart to
`ExtractDiscordCategoriesAndChannels`. The reason this API has been
added is because discord servers can have a lot of message and
channel data, which causes `ExtractDiscordCategoriesAndChannels` to
block the thread for too long, making apps potentially feel like they
are stuck.
This API runs inside a go routine, eventually calls
`ExtractDiscordCategoriesAndChannels`, and then emits a newly
introduced `DiscordCategoriesAndChannelsExtractedSignal` that clients
can react to.
Failure of extraction has to be determined by the
`discord.ImportErrors` emitted by the signal.
**A note about exported discord history files**
We expect users to export their discord histories via the
[DiscordChatExporter](https://github.com/Tyrrrz/DiscordChatExporter/wiki/GUI%2C-CLI-and-Formats-explained#exportguild)
tool. The tool allows to export the data in different formats, such as
JSON, HTML and CSV.
We expect users to have their data exported as JSON.
Closes: https://github.com/status-im/status-desktop/issues/6690
2022-07-13 11:33:53 +02:00
|
|
|
}
|
|
|
|
|
|
|
|
message DiscordMessageAuthor {
|
|
|
|
string id = 1;
|
|
|
|
string name = 2;
|
|
|
|
string discriminator = 3;
|
|
|
|
string nickname = 4;
|
|
|
|
string avatarUrl = 5;
|
2022-09-15 11:05:30 +02:00
|
|
|
bytes avatarImagePayload = 6;
|
|
|
|
string localUrl = 7;
|
feat: introduce messenger APIs to extract discord channels
As part of the new Discord <-> Status Community Import functionality,
we're adding an API that extracts all discord categories and channels
from a previously exported discord export file.
These APIs can be used in clients to show the user what categories and
channels will be imported later on.
There are two APIs:
1. `Messenger.ExtractDiscordCategoriesAndChannels(filesToimport
[]string) (*MessengerResponse, map[string]*discord.ImportError)`
This takes a list of exported discord export (JSON) files (typically one per
channel), reads them, and extracts the categories and channels into
dedicated data structures (`[]DiscordChannel` and `[]DiscordCategory`)
It also returns the oldest message timestamp found in all extracted
channels.
The API is synchronous and returns the extracted data as
a `*MessengerResponse`. This allows to make the API available
status-go's RPC interface.
The error case is a `map[string]*discord.ImportError` where each key
is a file path of a JSON file that we tried to extract data from, and
the value a `discord.ImportError` which holds an error message and an
error code, allowing for distinguishing between "critical" errors and
"non-critical" errors.
2. `Messenger.RequestExtractDiscordCategoriesAndChannels(filesToImport
[]string)`
This is the asynchronous counterpart to
`ExtractDiscordCategoriesAndChannels`. The reason this API has been
added is because discord servers can have a lot of message and
channel data, which causes `ExtractDiscordCategoriesAndChannels` to
block the thread for too long, making apps potentially feel like they
are stuck.
This API runs inside a go routine, eventually calls
`ExtractDiscordCategoriesAndChannels`, and then emits a newly
introduced `DiscordCategoriesAndChannelsExtractedSignal` that clients
can react to.
Failure of extraction has to be determined by the
`discord.ImportErrors` emitted by the signal.
**A note about exported discord history files**
We expect users to export their discord histories via the
[DiscordChatExporter](https://github.com/Tyrrrz/DiscordChatExporter/wiki/GUI%2C-CLI-and-Formats-explained#exportguild)
tool. The tool allows to export the data in different formats, such as
JSON, HTML and CSV.
We expect users to have their data exported as JSON.
Closes: https://github.com/status-im/status-desktop/issues/6690
2022-07-13 11:33:53 +02:00
|
|
|
}
|
2021-06-07 10:31:27 +02:00
|
|
|
|
2022-08-04 18:16:56 +02:00
|
|
|
message DiscordMessageReference {
|
|
|
|
string messageId = 1;
|
|
|
|
string channelId = 2;
|
|
|
|
string guildId = 3;
|
|
|
|
}
|
|
|
|
|
2022-09-20 11:13:44 +02:00
|
|
|
message DiscordMessageAttachment {
|
|
|
|
string id = 1;
|
|
|
|
string messageId = 2;
|
|
|
|
string url = 3;
|
|
|
|
string fileName = 4;
|
|
|
|
uint64 fileSizeBytes = 5;
|
|
|
|
string contentType = 6;
|
|
|
|
bytes payload = 7;
|
|
|
|
string localUrl = 8;
|
|
|
|
}
|
|
|
|
|
URL unfurling (initial implementation) (#3471)
This is the initial implementation for the new URL unfurling requirements. The
most important one is that only the message sender will pay the privacy cost for
unfurling and extracting metadata from websites. Once the message is sent, the
unfurled data will be stored at the protocol level and receivers will just
profit and happily decode the metadata to render it.
Further development of this URL unfurling capability will be mostly guided by
issues created on clients. For the moment in status-mobile:
https://github.com/status-im/status-mobile/labels/url-preview
- https://github.com/status-im/status-mobile/issues/15918
- https://github.com/status-im/status-mobile/issues/15917
- https://github.com/status-im/status-mobile/issues/15910
- https://github.com/status-im/status-mobile/issues/15909
- https://github.com/status-im/status-mobile/issues/15908
- https://github.com/status-im/status-mobile/issues/15906
- https://github.com/status-im/status-mobile/issues/15905
### Terminology
In the code, I've tried to stick to the word "unfurl URL" to really mean the
process of extracting metadata from a website, sort of lower level. I use "link
preview" to mean a higher level structure which is enriched by unfurled data.
"link preview" is also how designers refer to it.
### User flows
1. Carol needs to see link previews while typing in the chat input field. Notice
from the diagram nothing is persisted and that status-go endpoints are
essentially stateless.
```
#+begin_src plantuml :results verbatim
Client->>Server: Call wakuext_getTextURLs
Server-->>Client: Normalized URLs
Client->>Client: Render cached unfurled URLs
Client->>Server: Unfurl non-cached URLs.\nCall wakuext_unfurlURLs
Server->>Website: Fetch metadata
Website-->>Server: Metadata (thumbnail URL, title, etc)
Server->>Website: Fetch thumbnail
Server->>Website: Fetch favicon
Website-->>Server: Favicon bytes
Website-->>Server: Thumbnail bytes
Server->>Server: Decode & process images
Server-->>Client: Unfurled data (thumbnail data URI, etc)
#+end_src
```
```
,------. ,------. ,-------.
|Client| |Server| |Website|
`--+---' `--+---' `---+---'
| Call wakuext_getTextURLs | |
| ---------------------------------------> |
| | |
| Normalized URLs | |
| <- - - - - - - - - - - - - - - - - - - - |
| | |
|----. | |
| | Render cached unfurled URLs | |
|<---' | |
| | |
| Unfurl non-cached URLs. | |
| Call wakuext_unfurlURLs | |
| ---------------------------------------> |
| | |
| | Fetch metadata |
| | ------------------------------------>
| | |
| | Metadata (thumbnail URL, title, etc)|
| | <- - - - - - - - - - - - - - - - - -
| | |
| | Fetch thumbnail |
| | ------------------------------------>
| | |
| | Fetch favicon |
| | ------------------------------------>
| | |
| | Favicon bytes |
| | <- - - - - - - - - - - - - - - - - -
| | |
| | Thumbnail bytes |
| | <- - - - - - - - - - - - - - - - - -
| | |
| |----. |
| | | Decode & process images |
| |<---' |
| | |
| Unfurled data (thumbnail data URI, etc)| |
| <- - - - - - - - - - - - - - - - - - - - |
,--+---. ,--+---. ,---+---.
|Client| |Server| |Website|
`------' `------' `-------'
```
2. Carol sends the text message with link previews in the RPC request
wakuext_sendChatMessages. status-go assumes the link previews are good
because it can't and shouldn't attempt to re-unfurl them.
```
#+begin_src plantuml :results verbatim
Client->>Server: Call wakuext_sendChatMessages
Server->>Server: Transform link previews to\nbe proto-marshalled
Server->DB: Write link previews serialized as JSON
Server-->>Client: Updated message response
#+end_src
```
```
,------. ,------. ,--.
|Client| |Server| |DB|
`--+---' `--+---' `+-'
| Call wakuext_sendChatMessages| |
| -----------------------------> |
| | |
| |----. |
| | | Transform link previews to |
| |<---' be proto-marshalled |
| | |
| | |
| | Write link previews serialized as JSON|
| | -------------------------------------->
| | |
| Updated message response | |
| <- - - - - - - - - - - - - - - |
,--+---. ,--+---. ,+-.
|Client| |Server| |DB|
`------' `------' `--'
```
3. The message was sent over waku and persisted locally in Carol's device. She
should now see the link previews in the chat history. There can be many link
previews shared by other chat members, therefore it is important to serve the
assets via the media server to avoid overloading the ReactNative bridge with
lots of big JSON payloads containing base64 encoded data URIs (maybe this
concern is meaningless for desktop). When a client is rendering messages with
link previews, they will have the field linkPreviews, and the thumbnail URL
will point to the local media server.
```
#+begin_src plantuml :results verbatim
Client->>Server: GET /link-preview/thumbnail (media server)
Server->>DB: Read from user_messages.unfurled_links
Server->Server: Unmarshal JSON
Server-->>Client: HTTP Content-Type: image/jpeg/etc
#+end_src
```
```
,------. ,------. ,--.
|Client| |Server| |DB|
`--+---' `--+---' `+-'
| GET /link-preview/thumbnail (media server)| |
| ------------------------------------------> |
| | |
| | Read from user_messages.unfurled_links|
| | -------------------------------------->
| | |
| |----. |
| | | Unmarshal JSON |
| |<---' |
| | |
| HTTP Content-Type: image/jpeg/etc | |
| <- - - - - - - - - - - - - - - - - - - - - |
,--+---. ,--+---. ,+-.
|Client| |Server| |DB|
`------' `------' `--'
```
### Some limitations of the current implementation
The following points will become separate issues in status-go that I'll work on
over the next couple weeks. In no order of importance:
- Improve how multiple links are fetched; retries on failure and testing how
unfurling behaves around the timeout limits (deterministically, not by making
real HTTP calls as I did). https://github.com/status-im/status-go/issues/3498
- Unfurl favicons and store them in the protobuf too.
- For this PR, I added unfurling support only for websites with OpenGraph
https://ogp.me/ meta tags. Other unfurlers will be implemented on demand. The
next one will probably be for oEmbed https://oembed.com/, the protocol
supported by YouTube, for example.
- Resize and/or compress thumbnails (and favicons). Often times, thumbnails are
huge for the purposes of link previews. There is already support for
compressing JPEGs in status-go, but I prefer to work with compression in a
separate PR because I'd like to also solve the problem for PNGs (probably
convert them to JPEGs, plus compress them). This would be a safe choice for
thumbnails, favicons not so much because transparency is desirable.
- Editing messages is not yet supported.
- I haven't coded any artificial limit on the number of previews or on the size
of the thumbnail payload. This will be done in a separate issue. I have heard
the ideal solution may be to split messages into smaller chunks of ~125 KiB
because of libp2p, but that might be too complicated at this stage of the
product (?).
- Link preview deletion.
- For the moment, OpenGraph metadata is extracted by requesting data for the
English language (and fallback to whatever is available). In the future, we'll
want to unfurl by respecting the user's local device language. Some websites,
like GoDaddy, are already localized based on the device's IP, but many aren't.
- The website's description text should be limited by a certain number of
characters, especially because it's outside our control. Exactly how much has
not been decided yet, so it'll be done separately.
- URL normalization can be tricky, so I implemented only the basics to help with
caching. For example, the url https://status.im and HTTPS://status.im are
considered identical. Also, a URL is considered valid for unfurling if its TLD
exists according to publicsuffix.EffectiveTLDPlusOne. This was essential,
otherwise the default Go url.Parse approach would consider many invalid URLs
valid, and thus the server would waste resources trying to unfurl the
unfurleable.
### Other requirements
- If the message is edited, the link previews should reflect the edited text,
not the original one. This has been aligned with the design team as well.
- If the website's thumbnail or the favicon can't be fetched, just ignore them.
The only mandatory piece of metadata is the website's title and URL.
- Link previews in clients should be generated in near real-time, that is, as
the user types, previews are updated. In mobile this performs very well, and
it's what other clients like WhatsApp, Telegram, and Facebook do.
### Decisions
- While the user typing in the input field, the client is constantly (debounced)
asking status-go to parse the text and extract normalized URLs and then the
client checks if they're already in its in-memory cache. If they are, no RPC
call is made. I chose this approach to achieve the best possible performance
in mobile and avoid the whole RPC overhead, since the chat experience is
already not smooth enough. The mobile client uses URLs as cache keys in a
hashmap, i.e. if the key is present, it means the preview is readily available
(naive, but good enough for now). This decision also gave me more flexibility
to find the best UX at this stage of the feature.
- Due to the requirement that users should be able to see independent loading
indicators for each link preview, when status-go can't unfurl a URL, it
doesn't return it in the response.
- As an initial implementation, I added the BLOB column unfurled_links to the
user_messages table. The preview data is then serialized as JSON before being
stored in this column. I felt that creating a separate table and the related
code for this initial PR would be inconvenient. Is that reasonable to you?
Once things stabilize I can create a proper table if we want to avoid this
kind of solution with serialized columns.
2023-05-18 15:43:06 -03:00
|
|
|
message UnfurledLink {
|
|
|
|
// A valid URL which uniquely identifies this link.
|
|
|
|
string url = 1;
|
|
|
|
// Website's title.
|
|
|
|
string title = 2;
|
|
|
|
// Description is sometimes available, but can be empty. Most mainstream
|
|
|
|
// websites provide this information.
|
|
|
|
string description = 3;
|
|
|
|
bytes thumbnail_payload = 4;
|
|
|
|
uint32 thumbnail_width = 5;
|
|
|
|
uint32 thumbnail_height = 6;
|
|
|
|
}
|
|
|
|
|
Move to protobuf for Message type (#1706)
* Use a single Message type `v1/message.go` and `message.go` are the same now, and they embed `protobuf.ChatMessage`
* Use `SendChatMessage` for sending chat messages, this is basically the old `Send` but a bit more flexible so we can send different message types (stickers,commands), and not just text.
* Remove dedup from services/shhext. Because now we process in status-protocol, dedup makes less sense, as those messages are going to be processed anyway, so removing for now, we can re-evaluate if bringing it to status-go or not.
* Change the various retrieveX method to a single one:
`RetrieveAll` will be processing those messages that it can process (Currently only `Message`), and return the rest in `RawMessages` (still transit). The format for the response is:
`Chats`: -> The chats updated by receiving the message
`Messages`: -> The messages retrieved (already matched to a chat)
`Contacts`: -> The contacts updated by the messages
`RawMessages` -> Anything else that can't be parsed, eventually as we move everything to status-protocol-go this will go away.
2019-12-05 17:25:34 +01:00
|
|
|
message ChatMessage {
|
|
|
|
// Lamport timestamp of the chat message
|
|
|
|
uint64 clock = 1;
|
2022-09-28 19:42:17 +08:00
|
|
|
// Unix timestamps in milliseconds, currently not used as we use whisper as
|
|
|
|
// more reliable, but here so that we don't rely on it
|
Move to protobuf for Message type (#1706)
* Use a single Message type `v1/message.go` and `message.go` are the same now, and they embed `protobuf.ChatMessage`
* Use `SendChatMessage` for sending chat messages, this is basically the old `Send` but a bit more flexible so we can send different message types (stickers,commands), and not just text.
* Remove dedup from services/shhext. Because now we process in status-protocol, dedup makes less sense, as those messages are going to be processed anyway, so removing for now, we can re-evaluate if bringing it to status-go or not.
* Change the various retrieveX method to a single one:
`RetrieveAll` will be processing those messages that it can process (Currently only `Message`), and return the rest in `RawMessages` (still transit). The format for the response is:
`Chats`: -> The chats updated by receiving the message
`Messages`: -> The messages retrieved (already matched to a chat)
`Contacts`: -> The contacts updated by the messages
`RawMessages` -> Anything else that can't be parsed, eventually as we move everything to status-protocol-go this will go away.
2019-12-05 17:25:34 +01:00
|
|
|
uint64 timestamp = 2;
|
|
|
|
// Text of the message
|
|
|
|
string text = 3;
|
|
|
|
// Id of the message that we are replying to
|
|
|
|
string response_to = 4;
|
|
|
|
// Ens name of the sender
|
|
|
|
string ens_name = 5;
|
|
|
|
// Chat id, this field is symmetric for public-chats and private group chats,
|
|
|
|
// but asymmetric in case of one-to-ones, as the sender will use the chat-id
|
|
|
|
// of the received, while the receiver will use the chat-id of the sender.
|
2022-09-28 19:42:17 +08:00
|
|
|
// Probably should be the concatenation of sender-pk & receiver-pk in
|
|
|
|
// alphabetical order
|
Move to protobuf for Message type (#1706)
* Use a single Message type `v1/message.go` and `message.go` are the same now, and they embed `protobuf.ChatMessage`
* Use `SendChatMessage` for sending chat messages, this is basically the old `Send` but a bit more flexible so we can send different message types (stickers,commands), and not just text.
* Remove dedup from services/shhext. Because now we process in status-protocol, dedup makes less sense, as those messages are going to be processed anyway, so removing for now, we can re-evaluate if bringing it to status-go or not.
* Change the various retrieveX method to a single one:
`RetrieveAll` will be processing those messages that it can process (Currently only `Message`), and return the rest in `RawMessages` (still transit). The format for the response is:
`Chats`: -> The chats updated by receiving the message
`Messages`: -> The messages retrieved (already matched to a chat)
`Contacts`: -> The contacts updated by the messages
`RawMessages` -> Anything else that can't be parsed, eventually as we move everything to status-protocol-go this will go away.
2019-12-05 17:25:34 +01:00
|
|
|
string chat_id = 6;
|
|
|
|
|
|
|
|
// The type of message (public/one-to-one/private-group-chat)
|
|
|
|
MessageType message_type = 7;
|
|
|
|
// The type of the content of the message
|
|
|
|
ContentType content_type = 8;
|
|
|
|
|
|
|
|
oneof payload {
|
|
|
|
StickerMessage sticker = 9;
|
2020-05-13 15:16:17 +02:00
|
|
|
ImageMessage image = 10;
|
2020-06-17 20:55:49 +02:00
|
|
|
AudioMessage audio = 11;
|
2020-11-18 10:16:51 +01:00
|
|
|
bytes community = 12;
|
2022-08-04 18:16:56 +02:00
|
|
|
DiscordMessage discord_message = 99;
|
Move to protobuf for Message type (#1706)
* Use a single Message type `v1/message.go` and `message.go` are the same now, and they embed `protobuf.ChatMessage`
* Use `SendChatMessage` for sending chat messages, this is basically the old `Send` but a bit more flexible so we can send different message types (stickers,commands), and not just text.
* Remove dedup from services/shhext. Because now we process in status-protocol, dedup makes less sense, as those messages are going to be processed anyway, so removing for now, we can re-evaluate if bringing it to status-go or not.
* Change the various retrieveX method to a single one:
`RetrieveAll` will be processing those messages that it can process (Currently only `Message`), and return the rest in `RawMessages` (still transit). The format for the response is:
`Chats`: -> The chats updated by receiving the message
`Messages`: -> The messages retrieved (already matched to a chat)
`Contacts`: -> The contacts updated by the messages
`RawMessages` -> Anything else that can't be parsed, eventually as we move everything to status-protocol-go this will go away.
2019-12-05 17:25:34 +01:00
|
|
|
}
|
|
|
|
|
2020-11-18 10:16:51 +01:00
|
|
|
// Grant for community chat messages
|
|
|
|
bytes grant = 13;
|
|
|
|
|
2022-02-17 11:13:10 -04:00
|
|
|
// Message author's display name, introduced in version 1
|
|
|
|
string display_name = 14;
|
|
|
|
|
2023-02-02 18:12:20 +00:00
|
|
|
ContactRequestPropagatedState contact_request_propagated_state = 15;
|
2022-01-18 16:31:34 +00:00
|
|
|
|
URL unfurling (initial implementation) (#3471)
This is the initial implementation for the new URL unfurling requirements. The
most important one is that only the message sender will pay the privacy cost for
unfurling and extracting metadata from websites. Once the message is sent, the
unfurled data will be stored at the protocol level and receivers will just
profit and happily decode the metadata to render it.
Further development of this URL unfurling capability will be mostly guided by
issues created on clients. For the moment in status-mobile:
https://github.com/status-im/status-mobile/labels/url-preview
- https://github.com/status-im/status-mobile/issues/15918
- https://github.com/status-im/status-mobile/issues/15917
- https://github.com/status-im/status-mobile/issues/15910
- https://github.com/status-im/status-mobile/issues/15909
- https://github.com/status-im/status-mobile/issues/15908
- https://github.com/status-im/status-mobile/issues/15906
- https://github.com/status-im/status-mobile/issues/15905
### Terminology
In the code, I've tried to stick to the word "unfurl URL" to really mean the
process of extracting metadata from a website, sort of lower level. I use "link
preview" to mean a higher level structure which is enriched by unfurled data.
"link preview" is also how designers refer to it.
### User flows
1. Carol needs to see link previews while typing in the chat input field. Notice
from the diagram nothing is persisted and that status-go endpoints are
essentially stateless.
```
#+begin_src plantuml :results verbatim
Client->>Server: Call wakuext_getTextURLs
Server-->>Client: Normalized URLs
Client->>Client: Render cached unfurled URLs
Client->>Server: Unfurl non-cached URLs.\nCall wakuext_unfurlURLs
Server->>Website: Fetch metadata
Website-->>Server: Metadata (thumbnail URL, title, etc)
Server->>Website: Fetch thumbnail
Server->>Website: Fetch favicon
Website-->>Server: Favicon bytes
Website-->>Server: Thumbnail bytes
Server->>Server: Decode & process images
Server-->>Client: Unfurled data (thumbnail data URI, etc)
#+end_src
```
```
,------. ,------. ,-------.
|Client| |Server| |Website|
`--+---' `--+---' `---+---'
| Call wakuext_getTextURLs | |
| ---------------------------------------> |
| | |
| Normalized URLs | |
| <- - - - - - - - - - - - - - - - - - - - |
| | |
|----. | |
| | Render cached unfurled URLs | |
|<---' | |
| | |
| Unfurl non-cached URLs. | |
| Call wakuext_unfurlURLs | |
| ---------------------------------------> |
| | |
| | Fetch metadata |
| | ------------------------------------>
| | |
| | Metadata (thumbnail URL, title, etc)|
| | <- - - - - - - - - - - - - - - - - -
| | |
| | Fetch thumbnail |
| | ------------------------------------>
| | |
| | Fetch favicon |
| | ------------------------------------>
| | |
| | Favicon bytes |
| | <- - - - - - - - - - - - - - - - - -
| | |
| | Thumbnail bytes |
| | <- - - - - - - - - - - - - - - - - -
| | |
| |----. |
| | | Decode & process images |
| |<---' |
| | |
| Unfurled data (thumbnail data URI, etc)| |
| <- - - - - - - - - - - - - - - - - - - - |
,--+---. ,--+---. ,---+---.
|Client| |Server| |Website|
`------' `------' `-------'
```
2. Carol sends the text message with link previews in the RPC request
wakuext_sendChatMessages. status-go assumes the link previews are good
because it can't and shouldn't attempt to re-unfurl them.
```
#+begin_src plantuml :results verbatim
Client->>Server: Call wakuext_sendChatMessages
Server->>Server: Transform link previews to\nbe proto-marshalled
Server->DB: Write link previews serialized as JSON
Server-->>Client: Updated message response
#+end_src
```
```
,------. ,------. ,--.
|Client| |Server| |DB|
`--+---' `--+---' `+-'
| Call wakuext_sendChatMessages| |
| -----------------------------> |
| | |
| |----. |
| | | Transform link previews to |
| |<---' be proto-marshalled |
| | |
| | |
| | Write link previews serialized as JSON|
| | -------------------------------------->
| | |
| Updated message response | |
| <- - - - - - - - - - - - - - - |
,--+---. ,--+---. ,+-.
|Client| |Server| |DB|
`------' `------' `--'
```
3. The message was sent over waku and persisted locally in Carol's device. She
should now see the link previews in the chat history. There can be many link
previews shared by other chat members, therefore it is important to serve the
assets via the media server to avoid overloading the ReactNative bridge with
lots of big JSON payloads containing base64 encoded data URIs (maybe this
concern is meaningless for desktop). When a client is rendering messages with
link previews, they will have the field linkPreviews, and the thumbnail URL
will point to the local media server.
```
#+begin_src plantuml :results verbatim
Client->>Server: GET /link-preview/thumbnail (media server)
Server->>DB: Read from user_messages.unfurled_links
Server->Server: Unmarshal JSON
Server-->>Client: HTTP Content-Type: image/jpeg/etc
#+end_src
```
```
,------. ,------. ,--.
|Client| |Server| |DB|
`--+---' `--+---' `+-'
| GET /link-preview/thumbnail (media server)| |
| ------------------------------------------> |
| | |
| | Read from user_messages.unfurled_links|
| | -------------------------------------->
| | |
| |----. |
| | | Unmarshal JSON |
| |<---' |
| | |
| HTTP Content-Type: image/jpeg/etc | |
| <- - - - - - - - - - - - - - - - - - - - - |
,--+---. ,--+---. ,+-.
|Client| |Server| |DB|
`------' `------' `--'
```
### Some limitations of the current implementation
The following points will become separate issues in status-go that I'll work on
over the next couple weeks. In no order of importance:
- Improve how multiple links are fetched; retries on failure and testing how
unfurling behaves around the timeout limits (deterministically, not by making
real HTTP calls as I did). https://github.com/status-im/status-go/issues/3498
- Unfurl favicons and store them in the protobuf too.
- For this PR, I added unfurling support only for websites with OpenGraph
https://ogp.me/ meta tags. Other unfurlers will be implemented on demand. The
next one will probably be for oEmbed https://oembed.com/, the protocol
supported by YouTube, for example.
- Resize and/or compress thumbnails (and favicons). Often times, thumbnails are
huge for the purposes of link previews. There is already support for
compressing JPEGs in status-go, but I prefer to work with compression in a
separate PR because I'd like to also solve the problem for PNGs (probably
convert them to JPEGs, plus compress them). This would be a safe choice for
thumbnails, favicons not so much because transparency is desirable.
- Editing messages is not yet supported.
- I haven't coded any artificial limit on the number of previews or on the size
of the thumbnail payload. This will be done in a separate issue. I have heard
the ideal solution may be to split messages into smaller chunks of ~125 KiB
because of libp2p, but that might be too complicated at this stage of the
product (?).
- Link preview deletion.
- For the moment, OpenGraph metadata is extracted by requesting data for the
English language (and fallback to whatever is available). In the future, we'll
want to unfurl by respecting the user's local device language. Some websites,
like GoDaddy, are already localized based on the device's IP, but many aren't.
- The website's description text should be limited by a certain number of
characters, especially because it's outside our control. Exactly how much has
not been decided yet, so it'll be done separately.
- URL normalization can be tricky, so I implemented only the basics to help with
caching. For example, the url https://status.im and HTTPS://status.im are
considered identical. Also, a URL is considered valid for unfurling if its TLD
exists according to publicsuffix.EffectiveTLDPlusOne. This was essential,
otherwise the default Go url.Parse approach would consider many invalid URLs
valid, and thus the server would waste resources trying to unfurl the
unfurleable.
### Other requirements
- If the message is edited, the link previews should reflect the edited text,
not the original one. This has been aligned with the design team as well.
- If the website's thumbnail or the favicon can't be fetched, just ignore them.
The only mandatory piece of metadata is the website's title and URL.
- Link previews in clients should be generated in near real-time, that is, as
the user types, previews are updated. In mobile this performs very well, and
it's what other clients like WhatsApp, Telegram, and Facebook do.
### Decisions
- While the user typing in the input field, the client is constantly (debounced)
asking status-go to parse the text and extract normalized URLs and then the
client checks if they're already in its in-memory cache. If they are, no RPC
call is made. I chose this approach to achieve the best possible performance
in mobile and avoid the whole RPC overhead, since the chat experience is
already not smooth enough. The mobile client uses URLs as cache keys in a
hashmap, i.e. if the key is present, it means the preview is readily available
(naive, but good enough for now). This decision also gave me more flexibility
to find the best UX at this stage of the feature.
- Due to the requirement that users should be able to see independent loading
indicators for each link preview, when status-go can't unfurl a URL, it
doesn't return it in the response.
- As an initial implementation, I added the BLOB column unfurled_links to the
user_messages table. The preview data is then serialized as JSON before being
stored in this column. I felt that creating a separate table and the related
code for this initial PR would be inconvenient. Is that reasonable to you?
Once things stabilize I can create a proper table if we want to avoid this
kind of solution with serialized columns.
2023-05-18 15:43:06 -03:00
|
|
|
repeated UnfurledLink unfurled_links = 16;
|
|
|
|
|
Move to protobuf for Message type (#1706)
* Use a single Message type `v1/message.go` and `message.go` are the same now, and they embed `protobuf.ChatMessage`
* Use `SendChatMessage` for sending chat messages, this is basically the old `Send` but a bit more flexible so we can send different message types (stickers,commands), and not just text.
* Remove dedup from services/shhext. Because now we process in status-protocol, dedup makes less sense, as those messages are going to be processed anyway, so removing for now, we can re-evaluate if bringing it to status-go or not.
* Change the various retrieveX method to a single one:
`RetrieveAll` will be processing those messages that it can process (Currently only `Message`), and return the rest in `RawMessages` (still transit). The format for the response is:
`Chats`: -> The chats updated by receiving the message
`Messages`: -> The messages retrieved (already matched to a chat)
`Contacts`: -> The contacts updated by the messages
`RawMessages` -> Anything else that can't be parsed, eventually as we move everything to status-protocol-go this will go away.
2019-12-05 17:25:34 +01:00
|
|
|
enum ContentType {
|
2019-12-02 16:34:05 +01:00
|
|
|
UNKNOWN_CONTENT_TYPE = 0;
|
|
|
|
TEXT_PLAIN = 1;
|
|
|
|
STICKER = 2;
|
|
|
|
STATUS = 3;
|
|
|
|
EMOJI = 4;
|
2020-01-10 19:59:01 +01:00
|
|
|
TRANSACTION_COMMAND = 5;
|
2020-01-28 12:16:28 +01:00
|
|
|
// Only local
|
|
|
|
SYSTEM_MESSAGE_CONTENT_PRIVATE_GROUP = 6;
|
2020-05-13 15:16:17 +02:00
|
|
|
IMAGE = 7;
|
2020-06-17 20:55:49 +02:00
|
|
|
AUDIO = 8;
|
2020-11-18 10:16:51 +01:00
|
|
|
COMMUNITY = 9;
|
2021-03-25 16:15:22 +01:00
|
|
|
// Only local
|
|
|
|
SYSTEM_MESSAGE_GAP = 10;
|
2022-01-18 16:31:34 +00:00
|
|
|
CONTACT_REQUEST = 11;
|
2022-08-04 18:16:56 +02:00
|
|
|
DISCORD_MESSAGE = 12;
|
2022-08-31 15:41:58 +01:00
|
|
|
IDENTITY_VERIFICATION = 13;
|
2023-04-17 15:44:48 +01:00
|
|
|
// Only local
|
|
|
|
SYSTEM_MESSAGE_PINNED_MESSAGE = 14;
|
2023-06-08 16:00:19 +04:00
|
|
|
SYSTEM_MESSAGE_MUTUAL_STATE_UPDATE = 15;
|
Move to protobuf for Message type (#1706)
* Use a single Message type `v1/message.go` and `message.go` are the same now, and they embed `protobuf.ChatMessage`
* Use `SendChatMessage` for sending chat messages, this is basically the old `Send` but a bit more flexible so we can send different message types (stickers,commands), and not just text.
* Remove dedup from services/shhext. Because now we process in status-protocol, dedup makes less sense, as those messages are going to be processed anyway, so removing for now, we can re-evaluate if bringing it to status-go or not.
* Change the various retrieveX method to a single one:
`RetrieveAll` will be processing those messages that it can process (Currently only `Message`), and return the rest in `RawMessages` (still transit). The format for the response is:
`Chats`: -> The chats updated by receiving the message
`Messages`: -> The messages retrieved (already matched to a chat)
`Contacts`: -> The contacts updated by the messages
`RawMessages` -> Anything else that can't be parsed, eventually as we move everything to status-protocol-go this will go away.
2019-12-05 17:25:34 +01:00
|
|
|
}
|
|
|
|
}
|