Writing a Twilio bridge
In this post, I'll go over all the steps necessary to build a Twilio bridge using the new bridgev2 module in mautrix-go. The whole bridge can be found at github.com/mautrix/twilio.
Getting started with a new Go project
The first step to make a new Go project is to create a new directory and run go mod init <module path>, where <module path> is the import path of your new module (for example, the GitHub repo). In addition to that, we'll want to add the mautrix-go and Twilio libraries as dependencies. Since bridgev2 is under active development, we'll ask for @main instead of the latest tag.
When naming your bridge, please make up your own name and don't use mautrix-*.
The connector itself
The next step is creating the network connector itself. The connector is effectively a library, so we'll put it in pkg/connector/ and create a file called connector.go. Because this is a minimal example, that file will also be the only file in the connector package, but real connectors will probably want to split up the parts that come later into different files.
Inside the file, let's start by defining a struct called TwilioConnector. This struct is the main entrypoint to the connector. It is passed to the central bridge module and is used to initialize other things.
Then, add a line like this:
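Here is a minimal, self-contained sketch of that kind of line. To keep it runnable without mautrix-go, it uses a hypothetical one-method stand-in for bridgev2.NetworkConnector; in the real connector you'd reference the bridgev2 interface instead.

```go
package main

import "fmt"

// NetworkConnector is a hypothetical stand-in for bridgev2.NetworkConnector,
// reduced to a single method so that this example compiles on its own.
type NetworkConnector interface {
	GetName() string
}

// TwilioConnector is a stand-in for the connector struct.
type TwilioConnector struct{}

// GetName has a pointer receiver, so it is *TwilioConnector (not
// TwilioConnector) that satisfies the interface.
func (tc *TwilioConnector) GetName() string { return "Twilio" }

// Compile-time assertion: a nil *TwilioConnector is assigned to a variable of
// the interface type and immediately discarded. If a method is missing, the
// package fails to compile.
var _ NetworkConnector = (*TwilioConnector)(nil)

func main() {
	var nc NetworkConnector = &TwilioConnector{}
	fmt.Println(nc.GetName())
}
```

Removing the GetName method from the sketch makes the assertion line fail to compile, which is exactly the point of the pattern.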
This is a conventional method of ensuring that a struct implements a specific interface. The line is creating a value of type *TwilioConnector, then assigning it to a variable of type bridgev2.NetworkConnector, which will only compile if the value implements the interface. The variable name is the blank identifier _, which means the value will be discarded after being evaluated.
If you're using a smart editor, it should complain that *TwilioConnector does not in fact implement bridgev2.NetworkConnector, and possibly even offer you a quick way to create stub methods to implement the interface.
Init
The Init function is called when the bridge is initializing all types. It also gives you access to the bridge struct, which will need to be stored for later.
This function should not do any kind of IO or other complicated operations; it should just initialize the in-memory struct.
Start
The Start function is called slightly later in the startup. This can be used for bridge-wide IO operations, such as upgrading database schemas if the connector needs its own database tables.
In the case of Twilio, there's no need for special database tables, but we do need to register some routes, as receiving messages requires a webhook. Other networks that receive events via websockets/polling/etc. may not need to do anything at all here.
We'll come back to ReceiveMessage later.
GetCapabilities
The GetCapabilities function on the network connector is used to signal some bridge-wide capabilities, like disappearing message support. Twilio doesn't have any of the relevant features, so we'll just leave this empty.
GetName
The GetName function is used to customize the name of the bridge.
- DisplayName is a simple human-readable name for the network. It doesn't have any particular rules. It usually starts with a capital letter. This is used in lots of places.
- NetworkURL is the website associated with the network. This is used in the protocol section of m.bridge events.
- NetworkIcon is an mxc:// URI which contains the logo of the network. This is used in the protocol section of m.bridge events, as well as in the avatar of the bridge bot user.
- NetworkID is a string that uniquely identifies the network. If there are multiple bridge implementations for the same network, they should use the same ID. This is conventionally all lowercase.
- BeeperBridgeType identifies the specific bridge implementation. The Go module import path is a good option for this to ensure uniqueness, but bridges used by Beeper use shorter types (e.g. the Go rewrite of the Discord bridge used discordgo).
- DefaultPort can optionally be set to change the default port when generating the example config. It is not required and will default to 8008 when unset. All mautrix bridges use ports defined in mau.fi/ports.
- DefaultCommandPrefix can optionally be set to change the default command prefix when generating the example config. It is not required and will default to ! followed by the NetworkID.
GetConfig
Network connectors can define their own config fields, which for normal bridges using mxmain will be in the network: section of the config.
The GetConfig function returns all data that is needed to provide the config.
- example is the example config.
- data is a pointer to the object where the config should be decoded to.
- upgrader is a helper to perform config upgrades.
On startup, the bridge will read the user's config file as well as the example config, then call the upgrader to copy values from the user's config into the example, and finally overwrite the user's config with the example. Users can disable the overwriting part if they don't like it, but the first two steps are done in any case. There are two benefits to this system:
- the bridge doesn't need to have any backwards-compatibility for the config outside the upgrader function. The upgrader can simply copy fields from old locations into new ones.
- the user can easily get an upgraded config without having to manually figure out which fields have changed.
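To make the upgrade step a bit more concrete, here's a rough sketch of the idea using plain maps instead of the real upgrader helper. The Config type, the upgrade function and the key names (prefix, message_prefix, max_retries) are all made up for illustration; the actual helper in mautrix-go has a different API.

```go
package main

import "fmt"

// Config is a hypothetical flat config, standing in for the parsed config file.
type Config map[string]string

// upgrade starts from a copy of the example config and copies the user's
// values into it. Keys that only exist in the user's config are dropped, and
// keys the user never set keep the example's defaults.
func upgrade(example, user Config) Config {
	out := make(Config, len(example))
	for k, v := range example {
		out[k] = v
	}
	// Fields that kept their name are copied as-is.
	for k := range example {
		if v, ok := user[k]; ok {
			out[k] = v
		}
	}
	// A renamed field: read the old location, write the new one, unless the
	// user already has a value at the new location.
	if v, ok := user["prefix"]; ok {
		if _, alreadyNew := user["message_prefix"]; !alreadyNew {
			out["message_prefix"] = v
		}
	}
	return out
}

func main() {
	example := Config{"message_prefix": "[SMS]", "max_retries": "3"}
	user := Config{"prefix": "[Text]", "max_retries": "5", "obsolete": "x"}
	fmt.Println(upgrade(example, user))
}
```

All the backwards-compatibility logic lives in the one function, so the rest of the code only ever sees the current field names.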
The Twilio connector doesn't need any special fields, so we can just return nil values.
If you did want config fields, the response would look something like this:
and you'd have pkg/connector/example-config.yaml with
GetDBMetaTypes
The central bridge module has its own database where it stores things like room mappings, remote user info, logins, etc. The database columns cover fields that all bridges need like displaynames, but there are often a few network-specific fields that are necessary as well. To support such fields, the database also has a metadata column, which is just arbitrary JSON data.
To make the JSON data easier to consume, the network connector must provide typed structs for the tables it wants to add metadata to. The GetDBMetaTypes function returns functions that create new instances of each of the structs.
For Twilio, we want to store the credentials in UserLogins, but don't really need any other metadata, so we can omit all the other fields.
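As a rough illustration of why constructor functions are useful here, the sketch below decodes a JSON metadata column into a typed struct. MetaTypes and decodeMetadata are simplified stand-ins rather than the real bridgev2 types, and the field names in UserLoginMetadata are assumptions.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// UserLoginMetadata is a hypothetical metadata struct for user logins.
// The exact fields and JSON names are assumptions for this example.
type UserLoginMetadata struct {
	Phone      string `json:"phone"`
	AccountSID string `json:"account_sid"`
	AuthToken  string `json:"auth_token"`
}

// MetaTypes is a simplified stand-in for the value GetDBMetaTypes returns:
// one constructor per table, with nil meaning "no metadata for this table".
type MetaTypes struct {
	UserLogin func() any
	Portal    func() any
}

// decodeMetadata shows what the database layer can do with a constructor:
// create a fresh typed value and decode the JSON column into it.
func decodeMetadata(newMeta func() any, raw []byte) (any, error) {
	if newMeta == nil {
		return nil, nil
	}
	meta := newMeta()
	if err := json.Unmarshal(raw, meta); err != nil {
		return nil, err
	}
	return meta, nil
}

func main() {
	types := MetaTypes{
		UserLogin: func() any { return &UserLoginMetadata{} },
	}
	raw := []byte(`{"phone":"+15551234567","account_sid":"AC123","auth_token":"secret"}`)
	meta, err := decodeMetadata(types.UserLogin, raw)
	if err != nil {
		panic(err)
	}
	// Because the constructor returned *UserLoginMetadata, this cast is safe.
	fmt.Println(meta.(*UserLoginMetadata).Phone)
}
```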
After the struct is defined, we can safely cast the Metadata field in the relevant database structs into our metadata struct. We'll already use it in LoadUserLogin below. We also set the values in the login section.
LoadUserLogin
LoadUserLogin is called when the bridge wants to prepare an existing login for connection. This is where the NetworkAPI interface comes in: the primary purpose is to fill the Client property of UserLogin with the network client. This function should not do anything else: actually connecting to the remote network (if applicable) happens later in NetworkAPI.Connect.
We'll initialize the go-twilio client here. We're also initializing a RequestValidator. It's used for the webhooks, which we'll come back to later.
The network API
Next, we'll need to actually define TwilioClient and implement the NetworkAPI.
Like with the network connector, we'll do the same interface implementation assertion. In this case it's not technically necessary, as we're already assigning a &TwilioClient{} to UserLogin.Client, which does the same check. However, if you implement any of the optional extra interfaces, then the explicit assertions become very useful, as there's nothing else ensuring the correct functions are implemented.
Connect
For most networks which use persistent connections, this is where you'd set up the connection. Twilio doesn't use a persistent connection, so technically we don't need to do anything here. However, we should still check access token validity here.
Disconnect
For networks with persistent connections, Disconnect should tear down the connection. Twilio doesn't have a persistent connection, so we don't need to do anything here.
IsLoggedIn
On some networks, logins can be invalidated, so this function is used to check if the login is still valid. For Twilio, we'll just return true. Note that this method is not meant to do any IO; it should just return cached values.
LogoutRemote
This method is meant to invalidate remote network credentials and disconnect from the network. Since Twilio doesn't have credentials that can be invalidated, nor a persistent connection that can be disconnected, we don't need to do anything here.
GetCapabilities
This is similar to the network connector's GetCapabilities method, but is scoped to a user login and a portal. Currently, these fields are only used to check events before passing them to the network connector. Some of the fields are not used at all yet. The plan is to also send these fields to the room as a state event, so that clients could display limits directly to the user. The state event will likely use MSC4110 (or at least something similar).
For now, we don't really need to define any fields, but let's include Twilio's maximum message length.
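As a sketch of how such a limit could be used to check an outgoing event, here's a simplified example. RoomCapabilities and CheckText are hypothetical stand-ins (not the real bridgev2 types), 1600 is Twilio's documented limit for a message body, and counting Unicode code points is an assumption about how the length would be measured.

```go
package main

import (
	"fmt"
	"strings"
	"unicode/utf8"
)

// RoomCapabilities is a hypothetical, cut-down capability struct.
type RoomCapabilities struct {
	// MaxTextLength is the longest allowed message body. Zero means no limit.
	MaxTextLength int
}

// CheckText rejects message bodies that are longer than the declared limit.
func (c RoomCapabilities) CheckText(body string) error {
	if c.MaxTextLength <= 0 {
		return nil
	}
	if n := utf8.RuneCountInString(body); n > c.MaxTextLength {
		return fmt.Errorf("message is too long (%d > %d characters)", n, c.MaxTextLength)
	}
	return nil
}

func main() {
	caps := RoomCapabilities{MaxTextLength: 1600}
	fmt.Println(caps.CheckText("hello"))
	fmt.Println(caps.CheckText(strings.Repeat("a", 1601)))
}
```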
Identifiers
Before we get to the next functions in NetworkAPI, we'll need to cover network identifiers. Network IDs are opaque identifiers for various things on the remote network: logins, users, chats, messages, etc. Each identifier has its own type in the networkid module, which ensures that you can't accidentally mix up types. All the types are just strings behind the scenes.
All identifiers are generated by the network connector and won't be parsed by any other component. Other components also will not make any assumptions about different identifier types being similar, but the network connector itself is of course allowed to define that some types are equal. For example, most networks (but not all) will define that UserLoginIDs are the same as UserIDs. However, identifiers do have some uniqueness expectations that the network connector must meet.
For Twilio, we'll define the identifiers as follows:
- UserIDs are E.164 phone numbers without the leading +.
- PortalIDs are equivalent to UserIDs.
- MessageIDs are Twilio message SIDs.
- UserLoginIDs are account SID and phone SID joined with a :.
For convenience, we'll define some functions to cast strings into those types:
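A minimal sketch of what such helpers could look like is below. The three string types are local stand-ins for the corresponding networkid types so that the example runs on its own, and the function names are assumptions.

```go
package main

import (
	"fmt"
	"strings"
)

// Local stand-ins for the networkid types. Like the real ones, they are just
// strings, but the distinct types stop them from being mixed up by accident.
type (
	UserID      string
	PortalID    string
	UserLoginID string
)

// MakeUserID turns an E.164 phone number into a user ID by dropping the
// leading +.
func MakeUserID(e164Phone string) UserID {
	return UserID(strings.TrimPrefix(e164Phone, "+"))
}

// MakePortalID uses the same format as user IDs, since every chat is a DM
// with exactly one phone number.
func MakePortalID(e164Phone string) PortalID {
	return PortalID(MakeUserID(e164Phone))
}

// MakeUserLoginID joins the account SID and phone SID with a colon.
func MakeUserLoginID(accountSID, phoneSID string) UserLoginID {
	return UserLoginID(accountSID + ":" + phoneSID)
}

func main() {
	fmt.Println(MakeUserID("+15551234567"))
	fmt.Println(MakePortalID("+15551234567"))
	fmt.Println(MakeUserLoginID("AC123", "PN456"))
	// The following would not compile, because UserID and PortalID are
	// different types even though both are strings underneath:
	//   var p PortalID = MakeUserID("+15551234567")
}
```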
See the networkid module godocs for docs on all the different types of identifiers.
IsThisUser
Since UserIDs and UserLoginIDs are not interchangeable, we need to provide some way for the bridge to determine if a given user ID belongs to a user login. For most networks where user and login IDs are the same, you can just check for equality.
For this bridge, we're segregating different logins to have their own portals, which means this function is not actually necessary, and we could just hardcode it to return false. It's not hard to implement though, so let's do it anyway.
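One way to sketch this check is to compare the user ID against the phone number stored in the login's metadata. The types below are simplified stand-ins, the method signature is cut down compared to the real interface, and storing the phone number with a leading + in the metadata is an assumption.

```go
package main

import "fmt"

// Simplified stand-ins for the bridge types used by the check.
type UserID string

type UserLoginMetadata struct {
	Phone string // assumed to be in E.164 format, including the leading +
}

type UserLogin struct {
	Metadata *UserLoginMetadata
}

type TwilioClient struct {
	UserLogin *UserLogin
}

// IsThisUser reports whether the given user ID is the phone number that this
// login sends from. User IDs have no leading +, so one is added for comparison.
func (tc *TwilioClient) IsThisUser(userID UserID) bool {
	return "+"+string(userID) == tc.UserLogin.Metadata.Phone
}

func main() {
	tc := &TwilioClient{UserLogin: &UserLogin{
		Metadata: &UserLoginMetadata{Phone: "+15551234567"},
	}}
	fmt.Println(tc.IsThisUser("15551234567"))
	fmt.Println(tc.IsThisUser("15557654321"))
}
```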
If you were to define UserLoginID and UserID the same way, the check would be even simpler: just compare the two IDs for equality.
GetChatInfo
GetChatInfo returns the info for a given chat. All the values in the response struct are pointers, which means they can be omitted to tell the bridge that the corresponding room state event shouldn't be modified. For example, DMs generally don't have names, topics or avatars. However, even DMs do have members. The member list should always include all participants, so both the Matrix user and the remote user in DMs.
We're only handling DMs in this bridge, so we don't need to return anything other than the member list.
GetUserInfo
GetUserInfo is basically the same as GetChatInfo, but it returns info for a given user instead of a chat. The returned info will be applied as the ghost user's profile on Matrix.
Because we're bridging SMS and don't have a contact list, we don't really have any other info than the phone number itself. If we wanted to be fancy, we could format the phone number nicely for Name, but I couldn't find any convenient libraries similar to phonenumbers for Python.
In addition to Name, we set Identifiers which is a list of URIs that represent the user. In this case, we're using the tel: scheme, but you could also include network-specific @usernames with a custom scheme here.
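A small sketch of building that info from a user ID is below. UserInfo here is a simplified stand-in for the bridgev2 struct (which uses pointers for optional fields), and userInfoFor is a hypothetical helper name.

```go
package main

import "fmt"

// UserInfo is a simplified stand-in for the info returned by GetUserInfo.
type UserInfo struct {
	Name        string
	Identifiers []string
}

// userInfoFor builds the profile for a ghost from its user ID, which is an
// E.164 phone number without the leading +.
func userInfoFor(userID string) UserInfo {
	phone := "+" + userID
	return UserInfo{
		// No contact list is available, so the number doubles as the name.
		Name: phone,
		// tel: URIs are a standard way to refer to a phone number.
		Identifiers: []string{"tel:" + phone},
	}
}

func main() {
	info := userInfoFor("15551234567")
	fmt.Println(info.Name, info.Identifiers)
}
```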
The login process
In the first section about the network connector, we skipped the GetLoginFlows and CreateLogin functions, so let's get back to those.
GetLoginFlows returns the ways that can be used to log into the bridge. Each flow is just an internal ID, a human-readable name and a brief description. The internal ID of the flow the user picked is then passed to CreateLogin. The return type of CreateLogin is the third and final primary interface, LoginProcess.
Login process is meant to be a simple abstraction over login flows to arbitrary remote networks. It has three different step types that can hopefully be used to build any login flow there is:
- User input: fairly self-explanatory, ask the user to give values for one or more fields.
- Cookies: display a webview to the user and extract cookies, localStorage or other things after completion. For non-graphical logins (like using the bridge bot), this will ask the user to manually go to the website and extract the relevant values. If only cookies are necessary, the extraction can be done by using the "Copy as cURL" feature in browser devtools and pasting the result to the bridge bot.
- Display and wait: display something to the user, then wait until the remote network returns a response. This is used for things like QR logins or other flows where the user has to do something on an existing login.
Every step also has an internal identifier (reverse Java package naming style is recommended), general instructions for the entire step, and type-specific parameters.
In addition to the three real step types, there's a fourth special type indicating the login was successful.
GetLoginFlows & CreateLogin
In the case of Twilio, the user just needs to provide their API keys, so we'll use the user input type. First, we'll implement the two functions in the network connector.
TwilioLogin
Then we need to define the actual TwilioLogin type that CreateLogin returns:
We have a bunch of extra fields in addition to the User. They are used to store data when there are multiple login steps. Specifically, if the Twilio account has more than one phone number, we'll return a second step asking which one to use.
Here the interface implementation assertion is quite important. Returning TwilioLogin from CreateLogin ensures that the struct implements bridgev2.LoginProcess, but most login flows also need to implement one or more of the step type specific interfaces. In this case, we're using the user input type, so we want to make sure bridgev2.LoginProcessUserInput is implemented.
After that, we'll have three methods that need to be implemented: Start, SubmitUserInput and Cancel.
Start
Start returns the first step of the login process. For other networks that require a connection, this is probably also where the connection would be established. For Twilio, we don't have anything to connect to initially, we just want the user to provide their account SID and auth token.
SubmitUserInput and finishing the login
After the user provides the values, we'll get a call to SubmitUserInput. This will be a more complicated function. First, we need to validate the credentials and get a list of phone numbers available on the Twilio account. After that, we either finish the login if there's only one number, or ask the user which one to use if there are multiple. If we ask the user, then we'll get another call to SubmitUserInput, which means we need to remember the data from the first call. After a successful login, we prepare the UserLogin instance.
Let's split up the function to keep it more readable. First, SubmitUserInput itself. We have two paths, so we'll just split it into two calls. If Client is not set in TwilioLogin, we're in the first step where we want API keys. If it is set, we want to choose a phone number.
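The overall shape of that two-step flow can be sketched without any bridge or Twilio code. Everything below is a simplified stand-in: Step, PhoneNumber and Login are hypothetical types, the lookup function replaces the real API calls, and an authed flag plays the role of Client being set.

```go
package main

import (
	"errors"
	"fmt"
)

// Step is a hypothetical, simplified login step (not the bridgev2 type).
type Step struct {
	Done         bool     // true once the login has finished
	Instructions string   // shown to the user
	Fields       []string // input fields the user should fill in
}

// PhoneNumber is a phone number available on the account.
type PhoneNumber struct {
	SID    string
	Number string
}

// Login keeps state between SubmitUserInput calls.
type Login struct {
	// lookup stands in for validating the credentials and listing numbers.
	lookup func(accountSID, authToken string) ([]PhoneNumber, error)

	authed  bool          // plays the role of "Client is set"
	numbers []PhoneNumber // remembered from the first step
	Chosen  *PhoneNumber  // set when the login finishes
}

// SubmitUserInput dispatches to the right step based on the stored state.
func (l *Login) SubmitUserInput(input map[string]string) (*Step, error) {
	if !l.authed {
		return l.submitAPIKeys(input)
	}
	return l.submitChosenNumber(input)
}

func (l *Login) submitAPIKeys(input map[string]string) (*Step, error) {
	numbers, err := l.lookup(input["account_sid"], input["auth_token"])
	if err != nil {
		return nil, fmt.Errorf("failed to validate credentials: %w", err)
	} else if len(numbers) == 0 {
		return nil, errors.New("no phone numbers found on account")
	}
	l.authed = true
	l.numbers = numbers
	if len(numbers) == 1 {
		return l.finish(numbers[0])
	}
	return &Step{
		Instructions: "Choose the phone number to use",
		Fields:       []string{"phone_number"},
	}, nil
}

func (l *Login) submitChosenNumber(input map[string]string) (*Step, error) {
	for _, n := range l.numbers {
		if n.Number == input["phone_number"] {
			return l.finish(n)
		}
	}
	return nil, errors.New("that phone number is not on the account")
}

func (l *Login) finish(n PhoneNumber) (*Step, error) {
	l.Chosen = &n
	return &Step{Done: true, Instructions: "Logged in as " + n.Number}, nil
}

func main() {
	l := &Login{lookup: func(sid, token string) ([]PhoneNumber, error) {
		return []PhoneNumber{{SID: "PN1", Number: "+15550001"}}, nil
	}}
	step, err := l.SubmitUserInput(map[string]string{"account_sid": "AC123", "auth_token": "secret"})
	fmt.Println(step.Done, step.Instructions, err)
}
```

With a single number on the account, the first call already finishes the login; with several, the first call returns a second input step and the second call finishes it.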
Then the API key submit function.
Choosing the phone number is fairly simple, as we already have a valid token and have fetched the list of phone numbers. We just need to find the phone number the user chose.
Finally, the finish function, which can be called from either path and creates the UserLogin object. In addition to creating the object, we also send our webhook URL to Twilio. We'll define the GetWebhookURL function later when implementing ReceiveMessages.
Cancel
Cancel is called if the user cancels the login process. For networks that create some sort of connection, you should tear it down here. Since Twilio doesn't have any such connections, we don't need to do anything.
Note that this method is not called at the end of the login, nor if the login process returns errors. In both of those cases, you need to disconnect yourself. Errors returned by any step of the process are treated as fatal. If you want to prompt the user to retry, you should return another login step with the appropriate instructions. This is also how refreshing QR codes should be done.
Bridging messages
With everything else out of the way, let's get to the main point: bridging messages between Twilio and Matrix.
Twilio → Matrix
To receive messages from Twilio, we need to implement the ReceiveMessages function that we created to handle HTTP webhooks. Before that, let's define the webhook URL, which was used in the login section.
We don't need to check if Matrix implements MatrixConnectorWithServer, because we already validated that in Start. We can just cast it to access GetPublicAddress and then append our path. We include the user login ID in the URL in order to correctly route incoming webhooks.
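To illustrate the routing idea, here's a small sketch that builds such a URL and extracts the login ID again on the receiving side. The /<login ID>/receive path layout and both function names are assumptions made for this example, and the login ID is assumed to only contain URL-safe characters.

```go
package main

import (
	"fmt"
	"strings"
)

// webhookURL appends a per-login path to the bridge's public address.
func webhookURL(publicAddress, loginID string) string {
	return strings.TrimSuffix(publicAddress, "/") + "/" + loginID + "/receive"
}

// loginIDFromPath is the reverse operation, run on an incoming request path.
// It returns false if the path doesn't look like /<login ID>/receive.
func loginIDFromPath(path string) (string, bool) {
	if !strings.HasSuffix(path, "/receive") {
		return "", false
	}
	id := strings.TrimPrefix(strings.TrimSuffix(path, "/receive"), "/")
	if id == "" || strings.Contains(id, "/") {
		return "", false
	}
	return id, true
}

func main() {
	fmt.Println(webhookURL("https://bridge.example.com/", "AC123:PN456"))
	fmt.Println(loginIDFromPath("/AC123:PN456/receive"))
}
```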
The ReceiveMessages function contains a lot of boilerplate code that the Twilio library could handle, but doesn't. The main bridge-specific code is finding the user login based on the path parameter.
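Part of that boilerplate is checking the X-Twilio-Signature header on the incoming request. The bridge uses the RequestValidator from the Twilio library for this; purely as an illustration of what such a check involves, here's a standard-library sketch of the signing scheme described in Twilio's webhook security docs for form-encoded requests (HMAC-SHA1 over the full URL followed by the sorted parameters, base64-encoded). The function names are made up, parameters are simplified to one value per key, and real code should prefer the library's validator.

```go
package main

import (
	"crypto/hmac"
	"crypto/sha1"
	"encoding/base64"
	"fmt"
	"sort"
	"strings"
)

// computeSignature builds the string to sign (URL, then each parameter name
// and value in key order) and signs it with the auth token.
func computeSignature(authToken, fullURL string, params map[string]string) string {
	keys := make([]string, 0, len(params))
	for k := range params {
		keys = append(keys, k)
	}
	sort.Strings(keys)

	var payload strings.Builder
	payload.WriteString(fullURL)
	for _, k := range keys {
		payload.WriteString(k)
		payload.WriteString(params[k])
	}

	mac := hmac.New(sha1.New, []byte(authToken))
	mac.Write([]byte(payload.String()))
	return base64.StdEncoding.EncodeToString(mac.Sum(nil))
}

// validSignature compares the header value against the expected signature in
// constant time.
func validSignature(authToken, fullURL string, params map[string]string, got string) bool {
	expected := computeSignature(authToken, fullURL, params)
	return hmac.Equal([]byte(expected), []byte(got))
}

func main() {
	url := "https://bridge.example.com/AC123:PN456/receive"
	params := map[string]string{"From": "+15551234567", "Body": "hello"}
	sig := computeSignature("auth-token", url, params)
	fmt.Println(validSignature("auth-token", url, params, sig))
}
```

Note that the URL has to be the exact public URL Twilio called, which is one more reason the bridge needs to know its own public address.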
Finally, we need the actual handling function. All we really need to do is pass the event to the central bridge. To do that, we need to extract metadata like the portal and message IDs, and provide a converter function to actually convert the message into a Matrix event.
Simple connectors can use the types in the simplevent package as remote events, but for more complicated connectors, it often makes sense to create a custom type which implements bridgev2.RemoteEvent. That way, the type's methods can figure out the appropriate data to return, instead of having to fill a struct for every different event type.
Let's go over each of the fields we're filling:
- Type is the event type. It's a normal message.
- LogContext is a function that adds structured fields to the event handler's zerolog logger. By default, the logger only has the portal key and user login ID, so other things should be added here.
- PortalKey is the ID of the chat. This is a combination of a portal ID and an optional "receiver". Receivers can be used to segregate portals, so that if multiple logged-in users have the same chat, they'll still get separate portal rooms. Most networks should use receivers for DMs, but it is also possible to use them for all rooms if you don't want any portals to be shared. If there's no receiver, then users will be added to the same Matrix room.
- Data is the event data itself. This is only here so that it can be passed to the message convert function.
- CreatePortal tells the central bridge module that we want it to create a portal room if one doesn't already exist for the given portal key. The bridge will then call GetChatInfo to get the info of the chat to create.
- ID is the message ID.
- Sender is the sender of the message. For networks where the user can send messages from other clients, you should also fill IsFromMe and/or SenderLogin appropriately. For Twilio, we'll just assume you can't send messages from other clients (we don't support receiving those anyway), so we don't need to fill anything else than Sender.
- Timestamp is the message timestamp. Twilio doesn't seem to provide timestamps, so we just declare that the message was sent now.
- ConvertMessageFunc is a function that gets Data, the Portal object as well as a MatrixAPI and returns the Matrix events that should be sent.
The convert message function is very simple, as we only support plain text messages for now. If you wanted to bridge media, you'd download it from the remote network and reupload it to Matrix using intent.UploadMedia.
Matrix → Twilio
To receive messages from Matrix, we need to implement the HandleMatrixMessage function that we skipped over in the network API section. Responding is very simple, we just call the Twilio API and return the message ID.
Bonus feature: starting chats
We've implemented everything that's strictly necessary for a bridge to work, but let's add one optional feature on top: creating new portal rooms. To do this, we'll add another interface assertion for TwilioClient:
The interface requires us to implement the ResolveIdentifier method, which is used for both checking if an identifier is reachable and actually starting a direct chat. There are further optional interfaces for creating group chats with resolved identifiers, but we don't support group chats at all here, so let's stick to DMs.
The function just gets a raw string which is provided by the user. If you wanted to make a fancy Twilio bridge, you'd probably use the lookup API to get more info about the phone number, but we'll just do basic validation to make sure the input is a number.
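That basic validation could look roughly like the sketch below. parsePhoneNumber is a hypothetical helper: it only checks that the input consists of digits with an optional leading +, and returns the number in the same format as user IDs.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// parsePhoneNumber does very basic validation of user input. It accepts an
// optional leading + followed by digits only, and returns the digits without
// the +. It doesn't check that the number actually exists.
func parsePhoneNumber(input string) (string, error) {
	digits := strings.TrimPrefix(strings.TrimSpace(input), "+")
	if digits == "" {
		return "", errors.New("phone number is empty")
	}
	for _, r := range digits {
		if r < '0' || r > '9' {
			return "", fmt.Errorf("invalid character %q in phone number", r)
		}
	}
	return digits, nil
}

func main() {
	fmt.Println(parsePhoneNumber("+15551234567"))
	fmt.Println(parsePhoneNumber("not a number"))
}
```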
After validating the number, we'll get the ghost and portal objects as well as their info. The info for both is the same shape as the GetUserInfo and GetChatInfo methods, so we'll just call them instead of duplicating the same behavior.
We don't actually care about the createChat parameter here, because Twilio doesn't require creating chats explicitly. For networks which do require creating chats, you'd need to use the bool to decide whether it should be created or not. The Chat field in the response is mandatory when createChat is true, but can be omitted when it's false.
We also don't create the portal room here: the central bridge module takes care of that using the info we return.
Technically we don't even need to get the portal object: the central bridge module would get it automatically based on PortalID if Portal is omitted. However, I wanted to reuse GetChatInfo and didn't want to refactor it to take a plain PortalID instead of a whole Portal.
That's everything needed from the connector to enable starting chats. With that function implemented, the resolve-identifier and start-chat bot commands as well as the corresponding provisioning APIs will work.
Main function
Now that we have a functional network connector, all that's left is to wrap it up with a main function. The main function goes in cmd/mautrix-twilio/main.go, because it's a command called mautrix-twilio rather than a part of the library.
The main file doesn't need to be particularly complicated. First, we define some variables to store the version. These will be set at compile time using the -X linker flag. We'll go over the exact flags in the next section.
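As a sketch of how such variables work: -X can only set package-level string variables, so they're declared with placeholder defaults and overwritten at link time. The "unknown" defaults and the versionString helper below are assumptions for illustration, not necessarily what the bridge does with the values.

```go
package main

import "fmt"

// Package-level string variables that the linker can overwrite, for example:
//   go build -ldflags "-X main.Tag=v0.1.0 -X main.Commit=<hash>"
// Without those flags, the defaults below are used.
var (
	Tag       = "unknown"
	Commit    = "unknown"
	BuildTime = "unknown"
)

// versionString is a hypothetical helper that prefers the tag and falls back
// to a shortened commit hash when the build wasn't made from a tag.
func versionString() string {
	if Tag != "" && Tag != "unknown" {
		return Tag
	}
	if Commit != "unknown" && len(Commit) >= 8 {
		return "dev+" + Commit[:8]
	}
	return "dev"
}

func main() {
	fmt.Printf("version %s (built at %s)\n", versionString(), BuildTime)
}
```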
Then, we make the actual main function, which just creates a BridgeMain, gives it an instance of the connector, and runs the bridge.
That's it. The mxmain module is designed to wrap all the parts together to produce a traditional single-network bridge.
Building the bridge
To build the bridge, you can simply use go build ./cmd/mautrix-twilio in the repo root directory. However, to be slightly fancier, we also want to fill the version info variables that we added to main. To do that, we'll make a script called build.sh in the repo root.
Let's break it down:
The first line gets the version of mautrix-go in use by somewhat crudely parsing the go.mod file. Since there's a lot of code from mautrix-go being used, it's useful to have the exact commit embedded rather than having to figure it out based on the bridge version.
The second line defines all the linker flags. -s and -w are standard flags to strip debug information and DWARF symbols, respectively. They make the binary smaller, but also make it harder to debug using debuggers. Generally all you need in production is stack traces, and fortunately those remain intact.
Each -X flag sets the value of a variable in the binary. We set four variables:
- If we're on a Git tag, we want to set main.Tag to the tag name. Otherwise, it's set to an empty string. To do this, we want both --exact-match (don't output anything unless we're on a tag) and --tags (consider all tags instead of only annotated ones). If you use annotated tags for releases, you may want to remove --tags. The 2>/dev/null part is needed to suppress the error message when we're not on a tag.
- Commit is fairly straightforward, it's just the commit hash of the current commit (HEAD), which is easiest to find using git rev-parse HEAD.
- BuildTime is the current time in ISO 8601/RFC3339 format.
- Finally, we set GoModVersion inside mautrix-go to the version we extracted from go.mod.
With the linker flags defined, the last step is to actually call go build with those flags and tell it to build our command. The "$@" at the end passes any arguments given to the script to the go build command. For example, if you wanted to output to a different path, you could use ./build.sh -o example.exe.
Running the bridge
Finally, we have a bridge, it's compiled, all that's left is to run and use it. At this point, you can pretty much just follow the docs starting from the "Configuring and running" part.
1. Generate the example config using ./mautrix-twilio -e (it will be saved to config.yaml).
2. Edit the config like any other bridge.
3. Generate the appservice registration with ./mautrix-twilio -g.
4. Pass the appservice registration to your homeserver.
5. Run the bridge with ./mautrix-twilio.
After the bridge is running, start a chat with the bridge bot, and send login to start the login process. Then send your API keys as instructed, and you're good to go!
Running with Beeper
If you're using Beeper, you can skip steps 1-4 and just tell bbctl to generate a megabridge config:
Twilio is not the optimal example for this, as bbctl is optimized for bridges that don't require public HTTP endpoints, while Twilio does require one. You can get it to work by tweaking the config (specifically, disabling websocket mode and adding public_address). It should work more out of the box with bridges that don't need an HTTP server.
Conclusion
If you want to ask anything related to mautrix-go or this post, feel free to join #go:maunium.net. This post also accepts pull requests.