This week, we continue the digest from the Slack discussions.
Update stream #
Jian [Jul 25th at 1:27 PM]
hi there, I'm new to Vitess, now I'm following the user-guide from vitess.io to explore vitess, in update stream section, I notice they have change log, where could I see these change logs so I can have a better understanding of the update stream?
sougou
That's the only documentation we have about the update stream, but we'll be fixing docs for all vitess very soon.
Jian
sure sure, thank you very much!
vamsi
@sougou even if documentation is not ready yet, is there some info you can provide to Jian about where he can see change logs?
sougou
The end to end test can actually be handy. Let me get the link.
sougou
https://github.com/vitessio/vitess/blob/master/test/update_stream.py
sougou
this section in particular https://github.com/vitessio/vitess/blob/master/test/update_stream.py#L222
Jian [1 day ago]
:+1:
Fixing a failed MigrateServedTypes #
Vidhi [2:10 AM]
Hi
for slave rollback, this will work? ./lvtctl.sh MigrateServedTypes -reverse test_keyspace/0 rdonly
sougou [6:36 AM]
yes. that should work
Vidhi [6:48 AM]
If in case some error came during master switch, as for rollback (no reads and writes are happening), if I update the old master end-point in zookeeper . Will it work?
sougou [6:50 AM]
i think you have to manually repair. can you show me where it failed?
Vidhi [6:50 AM]
It didnt failed yet. I havent done the switch. Just want to figure out rollback plan if something went wrong
Can you please elaborate on manually repair. How to do that?
sougou [6:51 AM]
let me look it up
for master switch, what vtctld does is the following:
set a shard control record to disable query service on source master, and issue a refresh which also sets the source master read-only (edited)
then waits for replication to catch up.
Once caught up, it sets the shard control record to enable query service on destination masters, and issue a refresh on destination masters that makes them read-write.
If there is a failure in the middle, you have to manually do or undo the setting of the tablet control
using SetShardTabletControl
command
and then issue a RefreshStateByShard
to the relevant tablets
i'm working on improving this part: https://github.com/vitessio/vitess/pull/4034
sougou
#4034 vreplication: change to use new vt.vreplication
This change deprecates vt.blp_checkpoint in favor of vreplication, which stands for Vitess Replication.
The goal is to make vreplication a standalone sub-module of vitess that other services can use, including the resharding worflow.
The big change in the model is that vreplication is not owned by the resharding workflow. The workflow instead creates vreplication streams as needed, and controls them individually. The stream id for a replication is now generated by vreplication, which the resharding workflow stores and tracks.
This also means that a vreplication stream can be directly created and managed by anyone as needed. This allows for newer and more flexible workflows in the future._
Vidhi [7:00 AM]
Can you share the complete command to do these steps. I coulnt find it vitess docs
sougou [7:01 AM]
vtctl -h gives me this: SetShardTabletControl [--cells=c1,c2,...] [--blacklisted_tables=t1,t2,...] [--remove] [--disable_query_service]
to enable query service, you probably should use --remove
to disable --disable_query_service
I haven't used these myself. So, you should test them out yourself to make sure they work as intended.
You can try it out on the source master while it's serving queries to see if it stops serving
and re-enable it with --remove
Vidhi [7:04 AM]
Sure, will try this setup on stage first.
Thank you very much for the help :)
Reading from replicas
skyler [Jul 31]
Does vtgate support rewrite rules similar to ProxySQL? We’re using ProxySQL to send queries to a replica if it’s not too laggy.
Does vtgate, or some other component of the stack, support something similar?
I haven’t found much in docs and via google, so I assume no, but I thought I’d ask anyway.
sougou
@skyler can you give an example?
skyler
The actual config is pretty lengthy, but what we’re doing is matching for the string /*SLAVE OK*/
at the beginning of every query. If that string exists, then we route a query to a read replica if it’s replication lag is less than some threshold. If a replica’s replication lag is greater than the threshold, ProxySQL “shuns” it, which means that it removes the replica from the list of replicas that are available for querying.
sougou
this is supported differently by vitess
sougou
you can specify db name as db@replica
sougou
and the tolerances you mention can be specified to vttablet
skyler
Oh interesting, that’s very cool.
Reconstructing zk data #
vamsi [Jul 31st]
Do people who use vitess with ZK generally backup ZK data regularly? If not, what would happen if ZK data is somehow corrupt or if ZK dies for some unexpected reason?
sougou
zk data can be reconstructed if needed.
it's mostly metadata about keyspaces and shards
but it's still a good idea to back it up
vamsi
any tools that can reconstruct it?
sougou
to manually reconstruct? they would be the vtctl commands like CreateKeyspace
etc.
You could probably write a shell script to do this
Will be interesting if we could do a feature that generates this.
ameet
@vamsi we are using consul. We backup the vitess metadata every 30 mins. It has saved us at least once where an operator deleted the metadata by mistake. Also, we manually backup before doing the cutover operation for a shard split
sougou
If you loose all data. I think these steps will also work:
Recreate all the cells
Restart all vttablets
Perform
TabletExternallyReparented
on all master tablets
Your system should be pretty much restored to the old state.
Are primary keys needed #
faut [Aug 1st]
Is it imperative for tables to have a primary key in vitess?
derekperkins
it’s pretty much imperative in MySQL to have a PK, but I don’t think Vitess adds any more need for it. Are you wanting to run sharded or non-sharded?
faut
non-sharded. We have some tables that don’t have PKs, and vitess throws cannot identify primary key of statement
on updates and inserts.
sougou
it will work if you change mysql to RBR
Can sequence tables be in a sharded keyspace #
captaineyesight [Aug 1st]
Hi. I’m looking at sequences and I’m a little confused. Lets say I have a sharded cluster: foo 00-80 and foo 80-FF. In foo, I have a table named bar that has a lovely vschema that splits it between shards. Where does the bar_seq table go? 00-80 or 80-FF or should it be in a completely different place?
weitzman
The sequence table does not need to be in the same keyspace. The vitess examples tend to use a keyspace called “lookup” or something like that
The sequence table only has one row, so if you put it in the same keyspace it would end up in whatever tablet the primary key “0” maps to
If someone really didn’t want to go through the trouble of having multiple keyspaces there might be an argument to do that, but under normal circumstances you’d probably want the sequences in an unsharded keyspace
captaineyesight
thanks
sougou (update)
Submitted https://github.com/vitessio/vitess/pull/4134: vschema: allow pins in vschema. This allows you to pin a table to a specific shard by assigning a keyspace id to it.
Creating replicas for devs #
faut [Aug 2nd]
What are the suggestion for devs in minikube and simulating the effects of vitess (Assuming they will just run mysql with a DB named the same as the keyspace? So if they write toxic queries they know before they get to a staging environment etc. And is it possible to dump a keyspace(sharded/unsharded) so you can replicate that in a standalone mysql? ie: Is it possible to migrate out of vitess? (edited)
sougou
@faut I don't fully understand the question. Are you talking about migrating into vitess, or out?
To migrate out, you can just start sending queries directly to the mysql instances and tear down the vitess components. You could also replicate the data out and failover.
faut
:+1: Makes sense. But we’d need to rebuild/revert all the sharding?
sougou
Or reimplement sharding at the app layer If mysql can handle, you can also merge back all the shards into one
faut
And do you have suggestions for how to ‘replicate’ the database for devs. Or what to do for a dev environment? running vitess locally seems overkill.
sougou
If it's just to make the data available to devs, you can always setup a standalone replica from a vitess master.
faut
How can I restore that standalone from the backups created from vitess backup?
sougou
yeah. you can restore from those backups and point the restored db to the master if you're lazy, you could make vitess do it for you
bring up a replica vttablet. once it's brought up, kill just the vttablet (and delete its tablet record)
faut
is manually restoring the data just a case of copying the GCS bucket to datadir?
sougou
i believe so (don't know the mechanism for GCS) vitess copies the data files into the datastore as files so, if bucket==file, it should work the same way in reverse
faut
cool. Then theoretically I should be able to make a backup by just copying the files there. Then restoring from that on vttablet.
sougou
should work
faut Thank you, I’ve got a couple of ideas I will try.
Hackathon! #
raj.veerappan [Aug 2nd]
Another question on https://vitess.io/docs/overview/scalingwithvitess/#migrating-production-data-to-vitess
In that approach, you'd enable MySQL replication from your source database to the Vitess master database.
In the replication approach, does "Vitess master database" mean use the VTGate as the replication slave? Or the VTTablet of the master or the mysql of the master? If it's mysql of the master, does that populate the schema properly in Vitess?
faut
hey raj, if you’re planning to do a production migration to vitess maybe we can chat. We’re also planning to move to vitess so we’re struggling through similar issues.
raj.veerappan
I'm just doing this for a hackathon to prove things out and see if it'll work for us
sougou
People have adopted more approaches than those mentioned in that write-up. We need to update it with the new strategies
sougou
Dual-writes seems to be a popular approach
In that particular descrption, I think it meant mysql->mysql
raj.veerappan
what happens to the schema in that case?
I guess I thought updates to the schema have to go through vtgate
sougou
not necessary
even after you're fully migrated to vitess, you can deploy schema changes directly to the mysqls
and people often do, using tools like gh-ost, etc
the ApplySchema
is just a convenience
raj.veerappan
hmm, ok, I made that assumption because one approach I tried was to copy over the data files from my non-vitess mysql to the data directories of the vitess mysql instances. Then when I fired up vtgate and used the mysql command line client to inspect the db, I could see all the tables were there
but when I tried to select rows from a table, vtgate complained that it didn't recognize the table
sougou
ohh. you still need a vschema
something that describes how your shards are layed out
https://vitess.io/user-guide/vschema/
sougou [18 days ago]
if the target db is not sharded yet, the vschema is a simple json that lists the table names
raj.veerappan
nice! thank you, will try that now
sougou
https://github.com/vitessio/vitess/blob/master/examples/demo/schema/lookup/vschema.json
examples/demo/schema/lookup/vschema.json
{
"sharded": false,
"tables": {
"user_seq": {
"type": "sequence"
},
"music_seq": {
"type": "sequence"
},
"name_keyspace_idx": {}
}
}
tables should have no types. the sequence
tables are special case
raj.veerappan
right was gonna say, I didn't think I needed to create those until I sharded things
sougou
vitess will work without a vschema as long as there's only one keyspace, because it knows there's only one
as soon as you have more than one, it needs to know where to route the queries
raj.veerappan
when you say work without a vschema, will it function purely as a "connection pool" or will it still need to parse the queries and will only support the statements it supports?
sougou
it will still do some work, but most queries will just be passed through
raj.veerappan
one of the reasons I tried copying over the data files directly was that when I tried restoring from a mysqldump vtgate complained that it couldn't handle one of the insert statements to a many-to-many mapping table because it didn't understand the primary key
sougou
it's probably because the mysqls are setup as SBR
we recommend RBR now. Hopefully we can deprecate SBR support soon :slightly_smiling_face:
raj.veerappan
oh interesting, I didn't realize that would affect mysqldump
faut
would the vschema tables just be: tables: { user: {} }
?
sougou
"user":...
yeah
raj.veerappan
will retry importing using mysql dump after switching all the vitess instances to RBR, seems easier than creating that json
faut
raj, are you working in GCP or baremetal?
sougou
if it's a single keyspace, you shouldn't need that json (irrespective of how you do the import)
raj.veerappan
@sougou I think I may just be in a weird state right now because the mysql import failed halfway, will start over after wiping things out and see if I can just copy the data files over without doing anything with vschema
raj.veerappan
@faut I'm just doing baremetal for the hackathon, if we start using it in production it would be with k8s/AWS (edited)
faut I had the same problems when I mounted the datadir for a single database. It showed all the tables if i did show tables
it showed everything. But any query would say. the table didnt exist. Even direct to mysql
sougou
it may be related to vttablet not having reloaded the schema
vttablet reloads the schema every X minutes
faut
I did a vschema reload. But the problem is with mysql. Because even when querying directly it would fail
sougou
this is vttablet seeing the table. vschema is for vtgate (edited)
raj.veerappan
vttablet reloads the schema every X minutes
is there a way to force this?
sougou
yeah. vtctl ReloadSchema
faut
raj, if you come right with the datapath mounting please let me know. I couldn’t get it to work
sougou
there is a way to make vttablet auto-detect by making it watch the replication stream. most people prefer not to use that feature
i think the flag is -enable_replication_watcher
(not at my comp)
raj.veerappan
I wiped everything out and restarted and copied the data files over, when I login through vtgate I see the tables but in the UI for vitess the schema says empty and I'm not able to select from any of the tables in mysql client connected to vtgate
did vtctl ReloadSchema
against my master vttablet but the schema did not populate in web UI
so will try using the json and enumerate the table names
actually, will switch all the vitess mysql instances to RBR and try loading from mysqldump first
nice, that seems to be the way to go, only problem now is that our mysqldump has tables with foreign key constraints on tables that are defined further down in the dump and vtgate doesn't like that, will need to edit the dump and reorder the create table statements
sougou
whatever works :slightly_smiling_face:
Raj.veerappan
problem is that it seems like vtgate does not support disabling foreign key checks for loading from dump
raj.veerappan
even trying to disable for session throws
`mysql> set foreign_key_checks=0;
ERROR 1105 (HY000): vtgate: http://localhost:15001/: unsupported construct: set foreign_key_checks=0`
(edited)
raj.veerappan
well, I found a janky workaround that makes this easy, create a schema only mysqldump, open up mysql cli onto vtgate, run source
repeatedly until the table count stabilizes. Then source your data only dump, super janky but it works for my hackathon :slightly_smiling_face:
I made it work the proper way, didn't realize I just needed to load the mysqldump directly against the vitess mysql master instance and reloadschema and everything would "just work"
sougou
yeah. that would be the best.
faut
The problem for me with the mysqldump is the downtime. Snapshotting a disk and using it as a mount is much quicker. I have got things to work with the mysqldump. Just trying to figure out the best way to migrate in production.
Configuring the app to use VTGate #
Sean Gillespie [Aug 2nd]
Is there documentation on setting up an app to use vtgate? I can’t find much beyond saying the apps can use it like MySQL
sougou
there's not much to it. just point the app at vtgate on the mysql port
https://vitess.slack.com/archives/C0PQY0PTK/p1527271545000268
Command to connect to vtgate: mysql -h 127.0.0.1 -P 15306 -u mysql_user --password=mysql_password
Posted in #vitess May 25th
if you have many vtgates, you can put them behind an ELB
Sean Gillespie
Where do you set the user/pass?
sougou
in a credentials file like this https://github.com/vitessio/vitess/blob/master/examples/local/mysql_auth_server_static_creds.json
{
"mysql_user": [
{
"MysqlNativePassword": "*9E128DA0C64A6FCCCDCFBDD0FC0A2C967C6DB36F",
"Password": "mysql_password",
"UserData": "mysql_user"
}
],
"mysql_user2": [
{
"Password": "mysql_password",
"UserData": "mysql_user"
}
],
"mysql_user3": [
{
"MysqlNativePassword": "*9E128DA0C64A6FCCCDCFBDD0FC0A2C967C6DB36F",
"UserData": "mysql_user"
}
],
"vt_appdebug": [
{
"Password": "vtappdebug_password",
"UserData": "vt_appdebug"
},
{
"SourceHost": "localhost",
"Password": "",
"UserData": "vt_appdebug"
}
]
}
and give that to vtgate (look at vtgaet-up,sh) in that same directory
Overriding the db name #
raj.veerappan [Aug 2nd]
unfortunately looks like flyway relies on information_schema
for a bunch of logic and that's not available through vtgate
sougou
if you connect to a specific shard, vtgate will pass it through
it should be an unsharded keyspace, or something like ks:-80
raj.veerappan
but then the db name will be vt_db
instead of just db
I'll just disable flyway for now since migrations will probably need to be reworked if we use vitess
sougou
you have another option
you can override the dbname
vttablet command line -init_db_name_override
(edited)
and name the db as db
instead of vt_db
raj.veerappan [18 days ago]
lol, that might simplify things
Overriding the db name
raj.veerappan [Aug 2nd]
Seems like the ./lvtctl.sh CopySchemaShard test_keyspace/0 target/0
doesn't work if test_keyspace
has tables with foreign keys in it
sougou
yeah. You can do a custom schema deploy in that case
it's only a convenience
raj.veerappan
is there a gist for that too :slightly_smiling_face:
I guess I only need to deploy the schema for the particular tables that I'm vertically sharding?
will just do a show create table
on it on test_keyspace and just run directly using mysql on target
sougou
yup
raj.veerappan
if vtworker cannot find MASTER tablet for destination shard for target/0
even though I did the InitShardMaster
step, is there something else I need to do?
I see the target
keyspace in the web ui with its shards and one tagged as master correctly
sougou
check the status page for vttablet /debug/status
and the logs. Maybe it didn't initialize correctly
raj.veerappan
status is healthy
sougou
and it shows up as master in vtctld?
raj.veerappan
yes
sougou
the vtworker would have written a logfile
can you see if it has more info there?
can you also show me your vtworker command?
raj.veerappan
./sharded-vtworker.sh VerticalSplitClone --tables my_table target/0
will check the log file
the only error besides the cannot find MASTER...
one is proc.go:85] unexpected error on port 0: Get http://localhost:0/debug/pid: dial tcp [::1]:0: connect: can't assign requested address, trying to start anyway
sougou
what is the full error? (that error can come from three different places)
raj.veerappan
ohh, just noticed that it was in a cell that doesn't match mine
ahh, I updated the cell name in the other scripts but not in sharded-vtworker.sh
sougou
that will do it :slightly_smiling_face:
raj.veerappan
that was it :slightly_smiling_face:
been at it all day, starting to miss things
sougou [18 days ago]
don't forget about MigrateServedFrom
(not MigrateServedTypes
)