My current task is to perform binary classification using both numerical and ordinal categorical data.
For this task, I effectively replicated the code from the following TensorFlow tutorial: https://www.tensorflow.org/tutorials/structured_data/feature_columns
I am struggling to work out why the following error occurs and how to fix it.
Error:

    ValueError: slice index 0 of dimension 0 out of bounds. for 'strided_slice' (op: 'StridedSlice') with input shapes: [0], [1], [1], [1] and with computed input tensors: input[1] = <0>, input[2] = <1>, input[3] = <1>.
Code:

    import pandas as pd
    import tensorflow as tf
    from tensorflow import feature_column
    from tensorflow.keras import layers
    from sklearn.model_selection import train_test_split
    from sklearn.preprocessing import StandardScaler, PolynomialFeatures

    # A utility method to create a tf.data dataset from a Pandas Dataframe
    def df_to_dataset(dataframe, shuffle=True, batch_size=32):
        dataframe = dataframe.copy()
        labels = dataframe.pop('target').astype('float64')
        ds = tf.data.Dataset.from_tensor_slices((dict(dataframe), labels))
        if shuffle:
            ds = ds.shuffle(buffer_size=len(dataframe))
        ds = ds.batch(batch_size)
        return ds

    scaler = StandardScaler()
    vals = ["Low", "Med", "High"]

    # Categorical data preparation
    # Creativity, Productivity, Optimism, Pessimism is ordinal categorical --> numerical
    for c in features[2:]:
        if c != "Creativity":
            data[c] = pd.Categorical(data[c], categories = vals, ordered = True)
            data[c] = data[c].cat.codes / 2
        else:
            data[c] = pd.Categorical(data[c], categories = ["No", "Yes"], ordered = False)
            data[c] = data[c].cat.codes.astype('float64')

    data.loc[:, ["Social", "Exercise"]] = scaler.fit_transform(X = data.loc[:, ["Social", "Exercise"]].values)

    # Splitting Data
    train, test = train_test_split(data, test_size=0.2)
    train, val = train_test_split(train, test_size=0.2)

    # Tried reducing batch size as per a solution to a similar StackOverflow query
    batch_size = 2
    train_ds = df_to_dataset(train, batch_size=batch_size)
    val_ds = df_to_dataset(val, shuffle=False, batch_size=batch_size)
    test_ds = df_to_dataset(test, shuffle=False, batch_size=batch_size)

    feature_columns = []

    # numeric cols
    for header in features:
        feature_columns.append(feature_column.numeric_column(header))

    # This is to concatenate all of these features for each example into single vectors
    feature_layer = tf.keras.layers.DenseFeatures(feature_columns)

    # Model Architecture
    model = tf.keras.Sequential([
        feature_layer,
        layers.Dense(128, activation='relu'),
        layers.Dense(128, activation='relu'),
        layers.Dropout(.1),
        layers.Dense(1, activation='sigmoid')
    ])

    model.compile(optimizer='adam',
                  loss=tf.keras.losses.BinaryCrossentropy(from_logits=True),
                  metrics=['accuracy'])

    model.fit(train_ds, validation_data=val_ds, epochs=10)
Data:

    Every feature: ['Creativity', 'Productivity', 'Optimism', 'Pessimism']
    A batch of Creativity: tf.Tensor([0. 0.], shape=(2,), dtype=float64)
    A batch of Productivity: tf.Tensor([0.5 0.5], shape=(2,), dtype=float64)
    A batch of Optimism: tf.Tensor([-0.5 -0.5], shape=(2,), dtype=float64)
    A batch of Pessimism: tf.Tensor([-0.5 -0.5], shape=(2,), dtype=float64)
    A batch of targets: tf.Tensor([0. 0.], shape=(2,), dtype=float64)
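
To make the encoding step easier to check in isolation, here is a minimal sketch of what the `pd.Categorical(...).cat.codes / 2` step does on a small made-up column. The `sample` values are hypothetical and not taken from my dataset. It shows that pandas assigns code -1 to any value missing from `categories` (including NaN), which becomes -0.5 after dividing by 2. That may be where the -0.5 values in the Optimism and Pessimism batches come from, though I have not confirmed it.

```python
import pandas as pd

vals = ["Low", "Med", "High"]

# Hypothetical sample column; "Medium" is deliberately NOT one of `vals`.
sample = pd.Series(["Low", "Med", "High", "Medium"])

# Same encoding as in the question: ordered categorical -> integer codes -> scale.
as_categorical = pd.Series(pd.Categorical(sample, categories=vals, ordered=True))
encoded = as_categorical.cat.codes / 2

# Values outside `categories` get code -1, so they end up as -0.5 here.
print(encoded.tolist())
```

If the real columns contain labels other than exactly "Low", "Med", and "High" (different casing, trailing spaces, missing values), they would all collapse to -0.5 in the same way.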
Any help understanding where this error comes from and how to resolve it would be fantastic!
**EDIT:** When I run this in Google Colaboratory, I get the following error from `model.fit()`:

    ValueError: Feature (key: Creativity) cannot have rank 0. Given: Tensor("sequential_2/Cast:0", shape=(), dtype=float32)